Abstract
Citation counts are widely used as indicators of research quality to support or replace human peer review and for lists of top cited papers, researchers, and institutions. Nevertheless, the relationship between citations and research quality is poorly evidenced. We report the first large‐scale science‐wide academic evaluation of the relationship between research quality and citations (field normalized citation counts), correlating them for 87,739 journal articles in 34 field‐based UK Units of Assessment (UoA). The two correlate positively in all academic fields, from very weak (0.1) to strong (0.5), reflecting broadly linear relationships in all fields. We give the first evidence that the correlations are positive even across the arts and humanities. The patterns are similar for the field classification schemes of Scopus and Dimensions.ai, although varying for some individual subjects and therefore more uncertain for these. We also show for the first time that no field has a citation threshold beyond which all articles are excellent quality, so lists of top cited articles are not pure collections of excellence, and neither is any top citation percentile indicator. Thus, while appropriately field normalized citations associate positively with research quality in all fields, they never perfectly reflect it, even at high values.
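To make the described analysis concrete, the sketch below (a minimal Python illustration, not the authors' code; the file and column names are hypothetical) field-normalises citation counts and correlates them with quality scores within each field using Spearman's rho.

```python
# Minimal sketch (not the authors' code): field-normalise citation counts
# and correlate them with quality scores per field. Column names are hypothetical.
import pandas as pd
from scipy.stats import spearmanr

# Assumed columns: 'field', 'year', 'citations', 'quality_score'
df = pd.read_csv("articles.csv")  # hypothetical input file

# Normalised Citation Score: citations divided by the mean for the same field and year
df["ncs"] = df["citations"] / df.groupby(["field", "year"])["citations"].transform("mean")

# Spearman correlation between field-normalised citations and quality scores, per field
for field, group in df.groupby("field"):
    rho, p = spearmanr(group["ncs"], group["quality_score"])
    print(f"{field}: rho={rho:.2f} (p={p:.3f}, n={len(group)})")
```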
... This has led, in part, to the emergence of the field of scientometrics, with a focus on quantitative research evaluation, and many attempts to assess whether and when citation-based indicators could inform or replace human judgement. The consensus now seems to be that these indicators can inform human judgment in the health and physical sciences and to a weak extent in the social sciences and engineering, but not in the arts and humanities (e.g., Thelwall et al., 2023a). For this to be useful and relatively fair, at least three years of citation data may be needed (Wang, 2013), which is another substantial limitation in practice. ...
... In comparison to using field normalised citation counts as research quality indicators (Thelwall et al., 2023a), the correlations in the current article are higher for 28 out of 34 UoAs and highest overall for 25 out of 34 (Figure 9). Again, the correlations are not directly comparable for the same reasons as above, and ChatGPT has the advantage that it can be applied to recent articles. ...
... Figure 9. A comparison between the ChatGPT correlations from the current article (selective by department for most UoAs), machine learning predictions mainly leveraging citation data (Thelwall et al., 2023b) and field normalised citation counts (Thelwall et al., 2023a), both on the same REF2021 dataset covering all departments but excluding articles published after 2018 (due to insufficiently mature citation data). ...
Time spent by academics on research quality assessment might be reduced if automated approaches can help. Whilst citation-based indicators have been extensively developed and evaluated for this, they have substantial limitations and Large Language Models (LLMs) like ChatGPT provide an alternative approach. This article assesses whether ChatGPT 4o-mini can be used to estimate the quality of journal articles across academia. It samples up to 200 articles from all 34 Units of Assessment (UoAs) in the UK's Research Excellence Framework (REF) 2021, comparing ChatGPT scores with departmental average scores. There was an almost universally positive Spearman correlation between ChatGPT scores and departmental averages, varying between 0.08 (Philosophy) and 0.78 (Psychology, Psychiatry and Neuroscience), except for Clinical Medicine (rho=-0.12). Although other explanations are possible, especially because REF score profiles are public, the results suggest that LLMs can provide reasonable research quality estimates in most areas of science, and particularly the physical and health sciences and engineering, even before citation data is available. Nevertheless, ChatGPT assessments seem to be more positive for most health and physical sciences than for other fields, a concern for multidisciplinary assessments, and the ChatGPT scores are only based on titles and abstracts, so cannot be research evaluations.
... Despite the ostensible logic behind field normalisation, and many descriptive studies introducing or comparing indicators (e.g., Lundberg, 2007; Purkayastha et al., 2019; Waltman et al., 2012) or evaluating individual indicators against expert scores (Thelwall et al., 2023a), almost no study has systematically compared different indicators against a gold standard to demonstrate which is best or whether field normalisation is necessary. One partial exception is a small-scale comparison of indicators for 125 cell biology or immunology papers, where the gold standard was post-publication expert scores from the F1000 website. ...
... This is not ideal, however, since it is known that citations have substantially different values as research quality indicators, depending on the field. For example, they are relatively strong in the health, life and physical sciences but are weak in some social sciences and most arts and humanities (Thelwall et al., 2023a). ...
This article compares (1) citation analysis with OpenAlex and Scopus, testing their citation counts, document type/coverage and subject classifications and (2) three citation-based indicators: raw counts, (field and year) Normalised Citation Scores (NCS) and Normalised Log-transformed Citation Scores (NLCS). Methods (1&2): The indicators calculated from 28.6 million articles were compared through 8,704 correlations on two gold standards for 97,816 UK Research Excellence Framework (REF) 2021 articles. The primary gold standard is ChatGPT scores, and the secondary is the average REF2021 expert review score for the department submitting the article. Results: (1) OpenAlex provides better citation counts than Scopus and its inclusive document classification/scope does not seem to cause substantial field normalisation problems. The broadest OpenAlex classification scheme provides the best indicators. (2) Counterintuitively, raw citation counts are at least as good as nearly all field normalised indicators, and better for single years, and NCS is better than NLCS. (1&2) There are substantial field differences. Thus, (1) OpenAlex is suitable for citation analysis in most fields and (2) the major citation-based indicators seem to work counterintuitively compared to quality judgements. Field normalisation seems ineffective because more cited fields tend to produce higher quality work, affecting interdisciplinary research or within-field topic differences.
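For reference, the two normalised indicators being compared are usually defined along the following lines, where $c_i$ is the citation count of article $i$ and $F(i)$ is the set of articles from the same field and year. This is a reconstruction of the standard definitions rather than a formula quoted from the article.

```latex
% Standard forms of the normalised indicators (reconstruction, not quoted from the article)
\mathrm{NCS}_i  = \frac{c_i}{\tfrac{1}{|F(i)|}\sum_{j \in F(i)} c_j},
\qquad
\mathrm{NLCS}_i = \frac{\ln(1 + c_i)}{\tfrac{1}{|F(i)|}\sum_{j \in F(i)} \ln(1 + c_j)}
```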
... RQ5: The positive correlations with citation counts in all fields found above echo the positive correlations between field normalised citation counts and the article-level expert quality scores of REF2021 (Thelwall et al., 2023). This is consistent with, but does not prove, the hypothesis that ChatGPT scores tend to positively associate with article quality. ...
... The positive Medicine correlation is unexpected given the negative correlation between REF departmental scores and ChatGPT scores for the UK REF Clinical Medicine category previously found (Thelwall & Yaghi, 2024). This may indicate substantially different scopes for the two categories, a departmental selection effect in the previous study that selected articles from a few departments, or a UK anomaly, but probably not an underlying difference between citations and research quality in this field, given the known positive correlation between these two variables (Thelwall et al., 2023). ...
Some research now suggests that ChatGPT can estimate the quality of journal articles from their titles and abstracts. This has created the possibility to use ChatGPT quality scores, perhaps alongside citation-based formulae, to support peer review for research evaluation. Nevertheless, ChatGPT's internal processes are effectively opaque, despite it writing a report to support its scores, and its biases are unknown. This article investigates whether publication date and field are biasing factors. Based on submitting a monodisciplinary journal-balanced set of 117,650 articles from 26 fields published in the years 2003, 2008, 2013, 2018 and 2023 to ChatGPT 4o-mini, the results show that average scores increased over time, and this was not due to author nationality or title and abstract length changes. The results also varied substantially between fields, and first author countries. In addition, articles with longer abstracts tended to receive higher scores, but plausibly due to such articles tending to be better rather than due to ChatGPT analysing more text. Thus, for the most accurate research quality evaluation results from ChatGPT, it is important to normalise ChatGPT scores for field and year and check for anomalies caused by sets of articles with short abstracts.
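The field-and-year normalisation recommended above could be applied along the following lines (an illustrative sketch with hypothetical column names, not the article's procedure).

```python
# Illustrative sketch: field-and-year normalisation of LLM quality scores.
# Column names ('field', 'year', 'score') are hypothetical.
import pandas as pd

df = pd.read_csv("chatgpt_scores.csv")  # hypothetical input file

group_mean = df.groupby(["field", "year"])["score"].transform("mean")
group_std = df.groupby(["field", "year"])["score"].transform("std")

df["score_ratio"] = df["score"] / group_mean            # mean-normalised, as with citations
df["score_z"] = (df["score"] - group_mean) / group_std  # z-score alternative
```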
... The citation analysis of the documents provides information on the quality of the published document. A publication with a higher citation metric indicates that the quality of the document is high and has been cited by many researchers [90]. The citation relationship network between the 73 included studies on FDW was mapped as presented in Figure 3. ...
Foreign domestic workers (FDWs) face challenges that impact their psychosocial well-being and health behaviours. This study utilized bibliometric analyses to examine research trends on the psychosocial and health-related behaviours of FDWs in the Asia Pacific region. The bibliometric analysis comprised citation analysis and co-occurrence analysis. A systematic literature search in academic databases, including Scopus, identified 73 relevant articles published from 1996 to 2023. The growth trend revealed a steady increase in the number of publications on FDWs’ psychosocial and health-related behaviours in Asia over the years, with significant growth from 2018 to 2023, indicating an increasing interest in this research area. The citation analysis identified influential studies, active authors, and sources with high publication numbers in this research area. The analysis also examined the geographical distribution of studies, identifying the countries and organizations in Asia that contributed significantly to FDW research. The co-occurrence analysis of keywords identified key themes and concepts in the literature. The most active keywords identified include “COVID-19”, “Depression”, “Foreign Domestic Workers”, “Mental Health”, and “Quality of Life”. In conclusion, this study provides a comprehensive understanding of the current trends and state of knowledge on the psychosocial and health-related behaviours of FDWs in the Asia Pacific region.
... The citation-based indicator used was the Normalised Log-transformed Citation Score (NLCS) (Thelwall et al., 2023). This is a field and year normalised indicator. ...
Although citation-based indicators are widely used for research evaluation, they are not useful for recently published research, reflect only one of the three common dimensions of research quality, and have little value in some social sciences, arts and humanities. Large Language Models (LLMs) have been shown to address some of these weaknesses, with ChatGPT 4o-mini showing the most promising results, although on incomplete data. This article reports by far the largest scale evaluation of ChatGPT 4o-mini yet, and also evaluates its larger sibling ChatGPT 4o. Based on comparisons between LLM scores, averaged over 5 repetitions, and departmental average quality scores for 107,212 UK-based refereed journal articles, ChatGPT 4o is marginally better than ChatGPT 4o-mini in most of the 34 field-based Units of Assessment (UoAs) tested, although combining both gives better results than either one. ChatGPT 4o scores have a positive correlation with research quality in 33 of the 34 UoAs, with the results being statistically significant in 31. ChatGPT 4o scores had a higher correlation with research quality than long term citation rates in 21 out of 34 UoAs and a higher correlation than short term citation rates in 26 out of 34 UoAs. The main limitation is that it is not clear whether ChatGPT leverages public information about departmental research quality to cheat with its scores. In summary, the results give the first large scale evidence that ChatGPT 4o is competitive with citations as a new research quality indicator, but ChatGPT 4o-mini is more cost-effective.
... These anchoring effects reflect the power of classification as described by Bourdieu, where a single dominant measure can disproportionately influence the outcome of evaluations. The reliance on these quantifications, such as citations, as indicators of quality creates a self-reinforcing loop in which categories become deeply embedded in academic evaluation systems, as observed in the study of citations as indicators of research quality by Thelwall, Kousha, and Stuart (2023). ...
Why do two reviewers evaluate the same research so differently? The book explores the hidden mechanisms behind academic gatekeeping, uncovering how reviewers’ judgments shift depending on their underlying logic—whether based on truth-seeking, scholarly reputation, or rigid metrics. By focusing on cases with conflicting outcomes and inconsistencies in standards, it offers a rare glimpse into the complex and (sometimes) unpredictable world of academic promotion. This research not only dissects academic practices within Polish sociology but also provides a broader understanding of how global pressures reshape local scientific communities.
... The use of peer review outcomes for investigating the validity of metrics has a long tradition in scientometrics. An overview of the studies can be found in Bornmann et al. (2009), Bornmann and Tekles (2021), and Thelwall et al. (2023); the problems that are associated with these studies are discussed by Aksnes, Langfeldt, and Wouters (2019). In Wang (2024), such empirical results on the validity of the discussed metrics (based on assessments by peers) are missing. ...
... Using citations as performance indicators can trace its theoretical roots to the normative view which holds that citations are made to credit scientific contributions, reflecting the intellectual footprints of publications (Aksnes et al., 2019). Such a position has been further empirically supported, with studies demonstrating a strong correlation between citation counts and other quality measures such as peers' qualitative assessment (e.g., Bornmann & Leydesdorff, 2015;Thelwall et al., 2023). Nevertheless, applying social constructivist theory introduces a different perspective, wherein citing is a social process that engages with "struggles, rhetorics, tactical and strategic games" (Aksnes, 2005, p.14). ...
This study investigated factors influencing the citations of highly cited applied linguistics research over two decades. With a pool of 302 of the top 1% most cited articles in the field, we identified 11 extrinsic factors that were independent of scientific merit but could significantly predict citation counts, including journal-related, author-related, and article-related features. Specifically, the results of multiple linear regression models showed that the time-normalized article citations were significantly predicted by the number of authors, subfield, methodology, title length, CiteScore, accessibility, and scholar h-index. The remaining factors did not exhibit any statistical significance, including the number of references, funding, internationality, and geographical origin. The combined predictive power of all these factors (R²=.208, p<.05) verifies the role of nonscientific factors contributing to high citations for applied linguistics research. These results encourage applied linguistics researchers and practitioners to recognize the underlying forces affecting research impact and highlight the need for a reward system that exclusively favors sound academic practices.
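A minimal sketch of the kind of multiple linear regression reported above (hypothetical column names and data file; not the study's actual model) could be set up as follows.

```python
# Sketch of a multiple linear regression predicting time-normalised citations
# from extrinsic article features. All column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("highly_cited_articles.csv")  # hypothetical dataset

model = smf.ols(
    "normalised_citations ~ n_authors + C(subfield) + C(methodology) "
    "+ title_length + citescore + open_access + author_h_index",
    data=df,
).fit()
print(model.summary())  # reports R-squared and per-predictor significance
```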
... Limiting reviews to documents with more citations-a sign of greater "influence"-is a practice occasionally used in reviews to cope with large samples (Hiebl, 2023). However, while citation numbers seem to be generally related to research quality (Thelwall et al., 2023), this relation does not necessarily hold at the level of individual studies. Thus, citation count filters will inevitably miss out on otherwise relevant, yet potentially unusual and innovative results (Thelwall, 2017) that SLR readers also expect to be synthesized. ...
We introduce a comprehensive guide to improve literature sampling in systematic literature reviews (SLRs) and meta-analyses (MAs) in management and adjacent social science disciplines. Analyzing 404 SLRs and MAs from top-tier management-related journals and the 135 methodological studies on which they relied revealed significant opportunities to improve sampling quality. We found that inadequate guidance and an overreliance on unfounded "best practices" often lead to suboptimal search strategies and incomplete reporting. To improve literature sampling, we have developed a guide that details four sequenced and prioritized scoping, searching, screening, and reporting steps. The guide also includes recommendations for the most suitable literature databases to search with, suggested (AI-enabled) tool support, and two checks to validate the effectiveness of search strategies. We created this guide by adapting the state-of-the-art recommendations of the Cochrane Handbook and PRISMA-the widely acknowledged benchmarks in methodical and reporting guidance-and supplementing them with additional evidence-based methodological insights. By adopting the guide, authors can ensure their sampling is effective and meets the high standards of SLRs and MAs, thereby likely boosting the impact of their review. The guide is applicable to all review types that require comprehensive, transparent, and reproducible sampling, including bibliometric studies and integrative reviews.
... The citation count is an important indicator of an article's or journal's impact [30]. On the basis of total citations (TCs) and citations per year (TC/Y), the top 10 publications in clean cooking research in India are displayed in Table 5. ...
The clean cooking challenge in India is deeply intertwined with socioeconomic, cultural, and infrastructural barriers. Despite government initiatives, a significant portion of the population still depends on solid biomass for cooking, leading to severe health, environmental, and social consequences. Energy poverty limits access to affordable clean fuels, especially for low-income households, creating a complex scenario where economic growth alone does not drive a shift toward cleaner cooking options. Our study identifies three primary research themes in clean cooking: accessibility in rural areas, advancements in cookstove technology, and health impacts of indoor air pollution, particularly for vulnerable groups. This research aligns closely with SDG 7 (Affordable & Clean Energy) and contributes to SDGs 3 (Health) and 13 (Climate Action), highlighting the broader benefits of transitioning to cleaner energy sources. However, gaps remain in addressing SDGs 4 (Education), 16 (Institutions), and 17 (Partnerships), pointing to a need for greater policy integration and collaborative efforts. The case studies further illustrate challenges and disruptions caused by COVID-19, including setbacks in clean fuel programs and increased indoor pollution, emphasizing the urgency for supportive policies. India’s clean cooking policies have implications not only for its socioeconomic and environmental health but also for setting a regional benchmark in sustainable energy, positioning the country as a potential leader in clean energy initiatives across Asia. India’s clean cooking policies have implications beyond domestic benefits, potentially serving as a model for sustainable energy practices across Asia and bolstering global progress toward SDG targets.
... To ensure the quality of the screening results, the researchers focused on the number of citations in the papers, as citations can often indicate a paper's quality (Thelwall et al., 2022). Both researchers used identical screening steps and methods, compared their results, and identified 26 articles presenting differing perspectives. ...
This systematic review critically evaluates the application of new technologies such as digital twins, digital platforms, smart contracts, and supply chain control towers in supply chain digital transformation (SCDT). It aims to fill the gap in the existing literature by exploring their roles, challenges, and future trends, thereby providing academia and practitioners with strategic insights for successful implementation and enhanced digital project outcomes. Employing the PRISMA framework, this study scrutinizes a select cohort of 73 out of 10,847 articles from the Scopus and Web of Science (WOS) databases over the past decade. It systematically categorizes and analyzes the literature to trace publication trends, geographical distribution, current status and keywords, and future trends. The review identifies several new technologies increasingly pivotal in modern supply chains. The researchers delve into less explored areas, such as digital twins, smart contracts, and supply chain control towers (SSCT), highlighting their potential to revolutionize digital transformation. The analysis reveals these technologies as new trends with significant implications for future supply chain management (SCM). This review provides new insights into uses and predicts trends in technology development. It offers a forward-looking perspective that lays a foundational framework for subsequent research and practical deployments in digital transformation.
... The top quartile of each Scopus ranking contains the ten journals that receive the most references when discussing mathematical misconceptions. There is a considerable correlation between journal quality and citation impact, as evidenced by the relationship between first-quartile position and high citation counts (Thelwall et al., 2023). According to this theory, journals in the top quartile receive more citations, which is indicative of their standing and influence within the field of education. ...
The purpose of this study is to conduct a bibliometric analysis of the research published in the field of mathematics misconception from 1947 to 2023, to determine the general knowledge structure and participation in research publication. An analytical approach was used based on Scopus database data. This study used mixed methods: a quantitative method to summarize the articles using bibliometric analysis, and a qualitative method to analyze the content of the most cited papers on mathematics misconception. The results showed that research publications on mathematics misconceptions have increased over time. The majority of the researchers and educational institutions who published papers about mathematics misconceptions were from the USA, England, and Turkey. The most used keywords were teaching, students, and education. The qualitative analysis identified 23 common mathematics misconceptions, which were grouped into four categories: general mathematics misconception, algebraic mathematics misconception, trigonometric mathematics misconception, and calculus mathematics misconception.
... Regretfully, in too many cases, research success has been defined "operationally" as simply amounting to the score of the proposed index (Harnad, 2009). With significant progress in the current century (Aksnes et al., 2019;Tahamtan & Bornmann, 2019;Waltman, 2016), many studies have demonstrated that citation-based metrics are the most convenient indicators for research evaluation (Aksnes et al., 2019;Waltman, 2016) if they are used at a high aggregation level (Aksnes et al., 2023;Thelwall et al., 2023). ...
Purpose
To analyze the diversity of citation distributions to publications in different research topics to investigate the accuracy of size-independent, rank-based indicators. The top percentile-based indicators are the most common indicators of this type, and the evaluations of Japan are the most evident misjudgments.
Design/methodology/approach
The distributions of citations to publications from countries and journals in several research topics were analyzed along with the corresponding global publications using histograms with logarithmic binning, double rank plots, and normal probability plots of log-transformed numbers of citations.
Findings
Size-independent, top percentile-based indicators are accurate when the global ranks of local publications fit a power law, but deviations in the least cited papers are frequent in countries and occur in all journals with high impact factors. In these cases, a single indicator is misleading. Comparisons of the proportions of uncited papers are the best way to predict these deviations.
Research limitations
This study is fundamentally analytical, and its results describe mathematical facts that are self-evident.
Practical implications
Respectable institutions, such as the OECD, the European Commission, and the U.S. National Science Board, produce research country rankings and individual evaluations using size-independent percentile indicators that are misleading in many countries. These misleading evaluations should be discontinued because they can cause confusion among research policymakers and lead to incorrect research policies.
Originality/value
Studies linking the lower tail of citation distribution, including uncited papers, to percentile research indicators have not been performed previously. The present results demonstrate that studies of this type are necessary to find reliable procedures for research assessments.
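The distributional diagnostics described in the design section above can be sketched as follows (an illustrative Python example; the input file and bin choices are hypothetical, not taken from the study).

```python
# Sketch of the diagnostic plots described in the methodology above:
# log-binned citation histogram and a normal probability plot of log citations.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

citations = np.loadtxt("citations.txt")  # hypothetical input: one count per paper

# Histogram with logarithmic binning (uncited papers handled separately)
cited = citations[citations > 0]
bins = np.logspace(0, np.log10(cited.max()), 20)
plt.hist(cited, bins=bins)
plt.xscale("log")
plt.yscale("log")
plt.xlabel("citations")
plt.ylabel("number of papers")
plt.show()

# Normal probability plot of log-transformed citation counts
stats.probplot(np.log(cited), dist="norm", plot=plt)
plt.show()
```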
... 'While citations should not be the sole measure of importance and quality, they do serve as indicators of academic scholarship when evaluated in a balanced manner' (Davidson et al. 2014). Additionally, large-scale studies have shown a positive association between citation counts and overall research quality in nursing (Thelwall et al. 2023). However, as noted earlier, the likelihood of being cited is not solely dependent on research quality; it can also be influenced by positive study outcomes and the perceived authority of the authors, leading to citation biases (Urlings et al. 2021). ...
Aim
To assess the top 1000 cited nursing articles in terms of their impact, conceptual and social characteristics.
Design
Bibliometric literature review design.
Methods
A bibliometric analysis on the 1000 most cited nursing articles in English, focusing on assessing their impact and prevalent terms, keywords, co‐occurrence networks and topic trends. Non‐parametric statistical tests were used.
Data Sources
Web of Science Core Collection (accessed 14 February 2024).
Results
The 1000 most cited articles were exported from 201,310 eligible articles.
The most cited articles received a total of 319,643 citations. The Journal of Advanced Nursing and the International Journal of Nursing Studies were the most cited journals. Literature reviews accounted for 21% of the most cited articles, compared to only 7% of all eligible articles. Most first authors (63%) were female. The data showed an increase in female first authorship among the most cited articles over time. This may reflect a shift towards greater gender equity in nursing research. Shorter article titles and fewer article pages were associated with more citations.
Conclusion
Methodological and conceptual articles received the most citations, likely due to their broad applicability (e.g., across disciplines) and enduring relevance. There was a statistically significant correlation between article brevity and citation count, but the relationship should be viewed with caution given the small effect size.
Implications for the Profession
Bibliometrics is important for evidence‐based practice because it helps nurses evaluate journals, articles and research topics. Since citation counts do not always indicate research quality, nurses and nursing students would benefit from training in bibliometrics to enhance their critical thinking in this area.
Impact
Top‐cited nursing articles indicate influential research topics and methods. They also influence authors' academic career opportunities, allowing assessment of research equity in terms of dominant countries and author gender representation.
Reporting Method
The Preferred Reporting Items for Bibliometric Analysis (PRIBA) guidelines.
Patient or Public Contribution
No Patient or Public Contribution.
... Uniform evaluation procedures were perceived to be at odds with research practices in many fields, and the rise of societal expectations of utility came into conflict with some fields' internal preferences. These tensions triggered criticisms of inadequate assessment procedures and criteria, which in turn led to the question of what adequate procedures and criteria could look like (Aksnes and Rip 2009; Hug, Ochsner and Daniel 2013; Belcher et al. 2016; Krull and Tepperwien 2016; Ochsner, Hug and Daniel 2016; Aksnes, Langfeldt and Wouters 2019; Franssen 2022; Balaban and de Jong 2023; Thelwall et al. 2023). Empirical research has also contributed interesting insights into how quality criteria are negotiated and applied in different social contexts (Langfeldt 2001; Lamont 2009; Barlösius and Blem 2021). ...
Studies on research quality criteria and their application have largely not defined the concept of ‘research quality’. This reluctance to define and theoretically position the concept of research quality consigns empirical research to remain descriptive and makes methodological decisions more difficult. This paper aims to propose a theoretical approach to research quality by presenting a definition, grounding it in social theory, illustrating its applicability and exploring its methodological consequences for empirically investigating notions of research quality held by members of scientific communities.
... Firstly, quality standards for scientific papers should derive from human experts. While citation count is a common proxy for quality, it has limitations (Thelwall et al., 2023a). Highly cited articles may not necessarily reflect true quality, and some low-cited papers might be undervalued. ...
... On the other hand, despite all the benefits of increased academic impact for journals, there is a non-negligible problem in journal evaluation: citation indicators essentially characterize the impact of journals rather than their disruptive innovation [9]. Relevant studies have confirmed that the disruptive innovation of scientific research is declining [10] and that progress in various disciplines is slowing [11]. ...
Background
As an important platform for researchers to present their academic findings, medical journals have a close relationship between their evaluation orientation and the value orientation of their published research results. However, the differences between the academic impact and level of disruptive innovation of medical journals have not been examined by any study yet.
Objective
This study aims to compare the relationships and differences between the academic impact, disruptive innovation levels, and peer review results of medical journals and published research papers. We also analyzed the similarities and differences in the impact evaluations, disruptive innovations, and peer reviews for different types of medical research papers and the underlying reasons.
Methods
The general and internal medicine Science Citation Index Expanded (SCIE) journals in 2018 were chosen as the study object to explore the differences in the academic impact and level of disruptive innovation of medical journals based on the OpenCitations Index of PubMed open PMID-to-PMID citations (POCI) and H1Connect databases, respectively, and we compared them with the results of peer review.
Results
First, the correlation coefficients of the Journal Disruption Index (JDI) with the Journal Cumulative Citation for 5 years (JCC5), Journal Impact Factor (JIF), and Journal Citation Indicator (JCI) were 0.677, 0.585, and 0.621, respectively. The correlation coefficient of the absolute disruption index (Dz) with the Cumulative Citation for 5 years (CC5) was 0.635. However, the average difference in the disruptive innovation and academic influence rankings of journals reached 20 places (about 17.5%). The average difference in the disruptive innovation and influence rankings of research papers reached about 2700 places (about 17.7%). The differences reflect the essential difference between the two evaluation systems. Second, the top 7 journals selected based on JDI, JCC5, JIF, and JCI were the same, and all of them were H-journals. However, only 8 (8/15, 53%), 96 (96/150, 64%), and 880 (880/1500, 58.67%) of the top 0.1%, top 1%, and top 10% papers selected based on Dz and CC5, respectively, were the same. Third, research papers with the "changes clinical practice" tag showed only moderate innovation (4.96) and impact (241.67) levels but had high levels of peer-reviewed recognition (6.00) and attention (2.83).
Conclusions
The results of the study show that research evaluation based on innovative indicators is detached from the traditional impact evaluation system. The 3 evaluation systems (impact evaluation, disruptive innovation evaluation, and peer review) only have high consistency for authoritative journals and top papers. Neither a single impact indicator nor an innovative indicator can directly reflect the impact of medical research for clinical practice. How to establish an integrated, comprehensive, scientific, and reasonable journal evaluation system to improve the existing evaluation system of medical journals still needs further research.
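For orientation, the article-level disruption measure that indices such as Dz and the JDI build on is commonly defined as below (the widely used formulation following Wu and colleagues; the study's exact JDI and Dz definitions may differ in detail), where $n_f$ counts citing papers that cite the focal paper but none of its references, $n_b$ those citing both, and $n_r$ those citing only its references.

```latex
% Commonly used article-level disruption measure (a standard formulation,
% not necessarily the exact JDI/Dz definition used in the study)
D = \frac{n_f - n_b}{n_f + n_b + n_r}
```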
... However, advocates for responsible evaluation practices urge moving on from using citations in research assessment, arguing they provide an incomplete picture of research quality (Coalition for Advancing Research Assessment, 2023). Indeed, citations correlated only weakly with peer-assessed article quality in the humanities and social sciences (r=0.1-0.3) and moderately in the medical and life sciences and engineering fields (r=0.3-0.5), using data from the United Kingdom's Research Excellence Framework (REF; Thelwall et al., 2023a). This suggests citation impact captures just one aspect of article quality as a proxy for its academic value. ...
This study investigates the viability of distinguishing articles in questionable journals (QJs) from those in non-QJs on the basis of quantitative indicators typically associated with quality. Subsequently, I examine what can be deduced about the quality of articles in QJs based on the differences observed. I contrast the length of abstracts and full-texts, prevalence of spelling errors, text readability, number of references and citations, the size and internationality of the author team, the documentation of ethics and informed consent statements, and the presence of erroneous decisions based on statistical errors in 1,714 articles from 31 QJs, 1,691 articles from 16 journals indexed in Web of Science (WoS), and 1,900 articles from 45 mid-tier journals, all in the field of psychology. The results suggest that QJ articles do diverge from the disciplinary standards set by peer-reviewed journals in psychology on quantitative indicators of quality that tend to reflect the effect of peer review and editorial processes. However, mid-tier and WoS journals are also affected by potential quality concerns, such as under-reporting of ethics and informed consent processes and the presence of errors in interpreting statistics. Further research is required to develop a comprehensive understanding of the quality of articles in QJs.
... Citation counts are a conventional indicator used by scientometric analysts. Serial independent investigations confirm the generic positive relationship between peer judgments of research quality and average citation indicators (Adams, 2007;Thelwall et al., 2023;Waltman & Van Eck, 2015). ...
We examine the link between a country’s average citation impact and both national research assessment and international collaboration. Our analysis finds little synchrony between national policies and performance change. We do find extensive, synchronous, cross-national change, however, despite a diversity of national research strategies. Specifically, during 1981–2020, there are synchronous cross-national changes in bilateral, and later multilateral, collaboration. We deconstruct the citation indicators and show that the average citation impact of domestic research and of collaborative research changes little for most countries. Net increases in average national citation impact have instead been driven by rising collaboration and the emerging global network. Greater collaboration enables greater subject diversity, contributes to convergence of subjects, and influences performance indicators. Coincidentally, it also results in all large nations apparently achieving higher average impact than the world average. These effects suggest a need both to strengthen policy analysis of the global context and to construct proper performance indicators when developing research strategy.
... The strategy of promoting international collaboration is indirectly supported by evidence of the greater citation impact of internationally co-authored research (Zhou et al., 2020). This is in turn supported by evidence that more cited research tends to be higher quality, especially in the physical and health sciences (Thelwall et al., 2023c). Nevertheless, this citation advantage is at least partly an audience effect: more people aware of the work and citing it due to multiple national networks recognising the authors (Thelwall et al., 2023b;Wagner et al., 2019). ...
International collaboration is sometimes encouraged in the belief that it generates higher quality research or is more capable of addressing societal problems. Nevertheless, while there is evidence that the journal articles of international teams tend to be more cited than average, perhaps from increased international audiences, there is no science‐wide direct academic evidence of a connection between international collaboration and research quality. This article empirically investigates the connection between international collaboration and research quality for the first time, with 148,977 UK‐based journal articles with post publication expert review scores from the 2021 Research Excellence Framework (REF). Using an ordinal regression model controlling for collaboration, international partners increased the odds of higher quality scores in 27 out of 34 Units of Assessment (UoAs) and all Main Panels. The results therefore give the first large scale evidence of the fields in which international co‐authorship for articles is usually apparently beneficial. At the country level, the results suggest that UK collaboration with other high research‐expenditure economies generates higher quality research, even when the countries produce lower citation impact journal articles than the United Kingdom. Worryingly, collaborations with lower research‐expenditure economies tend to be judged lower quality, possibly through misunderstanding Global South research goals.
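As a rough illustration of the kind of ordinal regression described above (not the study's actual model; column names are hypothetical), a minimal Python sketch using statsmodels could look like this.

```python
# Sketch of an ordinal (logit) regression of quality scores on collaboration
# variables. All column names are hypothetical, not the study's data.
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

df = pd.read_csv("ref_articles.csv")  # hypothetical: quality_score (1-4) plus predictors

# Outcome must be treated as ordered
df["quality_score"] = pd.Categorical(df["quality_score"], ordered=True)
exog = df[["international_collab", "n_authors", "n_countries"]]

results = OrderedModel(df["quality_score"], exog, distr="logit").fit(method="bfgs")
print(results.summary())  # positive coefficient = higher odds of a better score
```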
... It is also important to remember that priority is not the only determinant of whether a publication is subsequently cited. Although citation rates are an imperfect indicator of article quality, the positive correlation between citation rates and independent measures of research quality (27) suggests that scientists are more likely to cite a work if it is of higher quality. It is more important to be good than first. ...
Decisions involving cooperation or competition are common in science. Here, we consider three situations frequently encountered in the biomedical sciences, namely, establishing priority, sharing reagents, and selecting a journal for publication, through the lens of the prisoner's dilemma. In each situation, cooperation is the best strategy for scientists and for science.
... Despite the contentious nature of utilizing citations as a means of evaluating research, recent scholarly investigations indicate a discernible association between research quality and citation metrics, even within the realm of humanities (Thelwall et al., 2023). Consequently, the increasing number of Ukrainian A&H outputs in Scopus, coupled with the decreasing number of highly-influential source titles and poor citedness, may be considered as hidden signals about the wrong motivation of authors. ...
Purpose
This article presents the results of a quantitative analysis of Ukrainian arts and humanities (A&H) research from 2012 to 2021, as observed in Scopus. It examines the overall publication activity and the relative share of A&H publications in relation to Ukraine's total research output, comparing them with other countries. The study analyzes the diversity and total number of sources, as well as the geographic distribution of authors and citing authors, to provide insights into the internationalization level of Ukrainian A&H research. Additionally, the topical spectrum and language usage are considered to complete the overall picture.
Design/methodology/approach
This study uses the Scopus database as the primary data source for analyzing the general bibliometric characteristics of Ukrainian A&H research. All document types, except Erratum, were considered. A language filter was applied to compare the bibliometric characteristics of English versus non-English publications. In addition to directly imported data from Scopus, the study employs the ready-to-use SciVal tools to operate with A&H subcategories and calculate additional bibliometric characteristics, such as Citations per Publication (CPP), Field-Weighted Citation Impact (FWCI) and journal quartiles. Information on the country of journal publishers and details on delisted journals from Scopus were obtained from the official Source Title List available on the Elsevier website and the SCImago Journal and Country Rank Portal.
Findings
According to the results obtained, the publication patterns for Ukrainian A&H research exhibit dynamics comparable to those of other countries, with a gradual increase in the total number of papers and sources. However, the citedness is lower than expected, and the share of publications in top-quartile sources is lower for 2020–2021 period compared to the previous years. The impact of internationally collaborative papers, especially those in English, is higher. Nevertheless, over half of all works remain uncited, probably due to the limited readership of the journals selected for publication.
Originality/value
This study provides original insights into the bibliometric characteristics of Ukrainian A&H publications between 2012 and 2021, as assessed using the Scopus database. The authors’ findings reveal that Ukraine's A&H publications have higher visibility than some Asian countries with similar population sizes. However, in comparison to other countries of similar size, Ukraine's research output is smaller. The authors also discovered that cultural and historical similarities with neighboring countries play a more significant role in publication activity than population size. This study highlights the low integration of Ukrainian A&H research into the global academic community, evident through a decline in papers published in influential journals and poor citedness. These findings underscore the importance for authors to prioritize disseminating research in influential journals, rather than solely focusing on indexing in particular databases.
... Higher-quality articles tend to be more cited than others from the same field and year in all fields, at least for UK research, with the highest (and strong) correlations being in health, life sciences and physical sciences and the lowest (and weak) being in the arts and humanities (Thelwall, Kousha, Abdoli, Stuart, Makita, Wilson, & Levitt, 2023a). The overall correlations may hide the fact that citations primarily reflect the impact component of research quality, rather than the soundness and originality dimensions (Aksnes et al., 2019). ...
Identifying factors that associate with more cited or higher quality research may be useful to improve science or to support research evaluation. This article reviews evidence for the existence of such factors in article text and metadata. It also reviews studies attempting to estimate article quality or predict long‐term citation counts using statistical regression or machine learning for journal articles or conference papers. Although the primary focus is on document‐level evidence, the related task of estimating the average quality scores of entire departments from bibliometric information is also considered. The review lists a huge range of factors that associate with higher quality or more cited research in some contexts (fields, years, journals) but the strength and direction of association often depends on the set of papers examined, with little systematic pattern and rarely any cause‐and‐effect evidence. The strongest patterns found include the near universal usefulness of journal citation rates, author numbers, reference properties, and international collaboration in predicting (or associating with) higher citation counts, and the greater usefulness of citation‐related information for predicting article quality in the medical, health and physical sciences than in engineering, social sciences, arts, and humanities.
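As an illustration of the citation-prediction task reviewed above, a minimal sketch (hypothetical features and file name, not a model from any reviewed study) might look like this in Python.

```python
# Sketch of predicting long-term citation counts from article metadata.
# Feature and file names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

df = pd.read_csv("article_metadata.csv")  # hypothetical dataset

X = df[["journal_citation_rate", "n_authors", "n_references",
        "international_collab", "title_length", "abstract_length"]]
y = np.log1p(df["citations"])  # log-transform skewed citation counts

model = GradientBoostingRegressor(random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"mean cross-validated R^2: {scores.mean():.2f}")
```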
Background:
The number of systematic reviews is increasing rapidly. Several methodologies exist for systematic reviews. Cochrane Reviews follow distinct methods to ensure they provide the most reliable and robust evidence, ideally based on rigorous evaluations of randomized controlled trials and other high-quality studies. We aimed to examine the difference in citation patterns of Cochrane Reviews and other systematic reviews.
Methods:
We conducted a bibliometric analysis of systematic reviews indexed in PubMed from 1993 to 2022. We collected data on citations from The Lens from 1993 to 2023, thus having at least 1-year follow-up on citations. The reviews were linked through their PubMed identifier. Comparisons between the Cochrane Reviews and other systematic reviews included total citations per review, reviews with zero citations, and the time window within which they receive citations.
Results:
We included 10,086 Cochrane Reviews and 231,074 other systematic reviews. Other systematic reviews received significantly more citations than Cochrane Reviews from 1993 to 2007. From 1993 to 1997, the median difference was 80 citations (95% CI = 79.6-80.4). From 2008 and forward, the overall number of citations was similar between Cochrane Reviews and other systematic reviews (2018-2022: median difference = 5 [95% CI = 4.9-5.1] in favor of Cochrane Reviews; p = 0.83). Systematic reviews with zero citations were rare in both groups, but it was observed more often among other systematic reviews than Cochrane Reviews. Over the last 30 years, the time window in which all reviews received citations narrowed.
Conclusion:
In recent years, Cochrane Reviews and other systematic reviews had similar citation patterns, but other systematic reviews received more citations from 1993 to 2007. Other systematic reviews were more often never cited than Cochrane Reviews, and were thus potentially wasted. The time window in which systematic reviews received citations has been progressively decreasing, possibly indicating a trend toward quicker recognition and uptake of these reviews within the academic community. Cochrane Reviews aim to provide robust evidence, but this is not reflected in the citation metrics compared to other systematic reviews.
Introduction - It is well documented that librarian involvement in systematic reviews generally increases quality of reporting and the review overall. We used bibliometric analysis methods to analyze the level of librarian involvement in systematic reviews conducted at the University of Alberta (U of A). Methods - Using Web of Science (WoS), we searched for systematic reviews completed in the years 2016-2020 with a U of A co-author. Systematic reviews identified through WoS were screened in two phases: 1) Exclusion of duplicates, protocols, other types of reviews, and systematic review methodology literature to leave true systematic review publications, 2) Screening for level of librarian involvement (acknowledgement, co-author, or no involvement). Results - 640 reviews were analyzed for the following categories: 1) librarian named as a co-author; 2) librarian named in the acknowledgements section; 3) librarian mentioned in the body of the manuscript; 4) no librarian involvement. We identified 152 reviews that named a librarian as a co-author on the paper, 125 reviews that named a librarian in the acknowledgements section, and 268 reviews that mentioned a librarian in the body of the review. Conclusion - There is a great deal of variation in how the work of librarians is reflected in systematic reviews. This was particularly apparent in reviews where a librarian was mentioned in the body of the review but not named as an author or formally acknowledged. Continuing to educate researchers about the work of librarians is crucial to fully represent the value librarians bring to systematic reviews.
Distinguishing between research collaboration, consultancy, dissemination, and commercialization of research results, this paper analyses the determinants of researchers' societal engagement. The analytical framework integrates societal engagement as part of the credibility cycle. Several variables extend previous findings on determinants and mechanisms—herein scientific recognition and funding sources. A novel method to investigate the relationship between scientific recognition and societal engagement is explored. Drawing on a large-scale survey of European-based researchers in physics, cardiology, and economics, we find that several factors are associated with different modes of societal engagement in complex and intersecting ways. Scientific recognition is positively associated with research collaboration and dissemination, while organizational seniority is associated with all modes except for research collaboration with non-scientific actors. Female gender is positively associated with dissemination, and external funding sources are positively associated with all. The findings intersect with differences in the three research fields.
Do women face a disadvantage in terms of citation rates, and if so, in what ways? This article provides a comprehensive overview of existing research on the relationship between gender and citations. Three distinct approaches are identified: (1) per‐article approach that compares gender differences in citations between articles authored by men and women, (2) per‐author approach that compares the aggregate citation records of men and women scholars over a specified period or at the career level, and (3) reference‐ratio approach that assesses the gender distribution of references in articles written by men and women. I show that articles written by women receive comparable or even higher rates of citations than articles written by men. However, women tend to accumulate fewer citations over time and at the career level. Contrary to the notion that women are cited less per article due to gender‐based bias in research evaluation or citing behaviors, this study suggests that the primary reason for the lower citation rates at the author level is women publishing fewer articles over their careers. Understanding and addressing the gender citation gap at the author level should therefore focus on women's lower research productivity over time and the contributing factors. To conclude, I discuss the potential detrimental impact of lower citations on women's career progression and the ways to address the issue to mitigate gender inequalities in science.
Purpose
Technology is sometimes used to support assessments of academic research in the form of automatically generated bibliometrics for reviewers to consult during their evaluations or by replacing some or all human judgements. With artificial intelligence (AI), there is increasing scope to use technology to assist research assessment processes in new ways. Since transparency and fairness are widely considered important for research assessment and AI introduces new issues, this review investigates their implications.
Design/methodology/approach
This article reviews and briefly summarises transparency and fairness concerns in general terms and through the issues that they raise for various types of Technology Assisted Research Assessment (TARA).
Findings
Whilst TARA can have varying levels of problems with both transparency and bias, in most contexts it is unclear whether it worsens the transparency and bias problems that are inherent in peer review.
Originality/value
This is the first analysis that focuses on algorithmic bias and transparency issues for technology assisted research assessment.
The article is devoted to the problem of improving the governance of publication systems, within which their actors interact in producing scientific publications, supplying them to readers, as well as in funding and coordinating corresponding processes. It is emphasized that the ownership of a scientific text includes two components: the right to a monetary reward for the use of the article by the consumer and authorship. The first component can be passed to another person, but the second cannot. Authorship is the basis for building up an individual intangible asset, which we call authorship capital. The desire to increase it determines the dual role of the author in the publication system: he is not only a producer of the knowledge embodied in the article, but also, along with the reader, its ultimate consumer. The journal also has a dual role: by organizing the review process, it is not only a supplier of articles, but also a producer of knowledge. These two features give rise to a variety of possible financing schemes for publishing systems. The specific features of knowledge as a private and public good are analyzed. One of them is the high cost of knowledge consumption. Due to this and a number of other circumstances, the market model for financing publication systems is inefficient; the most important task is the transition to open access. Such a transition should be accompanied by improved methods for evaluating the performance of researchers and the quality of journals. The comparison of large groups of objects (e.g., journals or research institutions) is inevitably based on citation indicators, while expertise can play only a supporting role. On the contrary, when it comes to making decisions within a small group, e.g., when allocating given funds among laboratory members, expert evaluations must play a decisive role. The directions of reform of the Russian publication system are discussed, aiming to reduce rent-seeking activity and increase the adequacy of the indicators used.
The Journal Impact Factor and other indicators that assess the average citation rate of articles in a journal are consulted by many academics and research evaluators, despite initiatives against overreliance on them. Undermining both practices, there is limited evidence about the extent to which journal impact indicators in any field relate to human judgements about the quality of the articles published in the field’s journals. In response, we compared average citation rates of journals against expert judgements of their articles in all fields of science. We used preliminary quality scores for 96,031 articles published 2014–18 from the UK Research Excellence Framework 2021. Unexpectedly, there was a positive correlation between expert judgements of article quality and average journal citation impact in all fields of science, although very weak in many fields and never strong. The strength of the correlation varied from 0.11 to 0.43 for the 27 broad fields of Scopus. The highest correlation for the 94 Scopus narrow fields with at least 750 articles was only 0.54, for Infectious Diseases, and there was only one negative correlation, for the mixed category Computer Science (all), probably due to the mixing. The average citation impact of a Scopus-indexed journal is therefore never completely irrelevant to the quality of an article but is also never a strong indicator of article quality. Since journal citation impact can at best moderately suggest article quality it should never be relied on for this, supporting the San Francisco Declaration on Research Assessment.
A number of indications, such as the number of Nobel Prize winners, show Japan to be a scientifically advanced country. However, standard bibliometric indicators place Japan as a scientifically developing country. The present study is based on the conjecture that Japan is an extreme case of a general pattern in highly industrialized countries. In these countries, scientific publications come from two types of studies: some pursue the advancement of science and produce highly cited publications, while others pursue incremental progress and their publications have a very low probability of being highly cited. Although these two categories of papers cannot be easily identified and separated, the scientific level of Japan can be tested by studying the extreme upper tail of the citation distribution of all scientific articles. In contrast to standard bibliometric indicators, which are calculated from the total number of papers or from sets of papers in which the two categories of papers are mixed, in the extreme upper tail, only papers that are addressed to the advance of science will be present. Based on the extreme upper tail, Japan belongs to the group of scientifically advanced countries and is significantly different from countries with a low scientific level. The number of Clarivate Citation laureates also supports our hypothesis that some citation-based metrics do not reveal the high scientific level of Japan. Our findings suggest that Japan is an extreme case of inaccuracy of some citation metrics; the same drawback might affect other countries, although to a lesser degree.
The top-1% most-highly-cited articles are watched closely as the vanguards of the sciences. Using Web of Science data, one can find that China had overtaken the USA in the relative participation in the top-1% (PP-top1%) in 2019, after outcompeting the EU on this indicator in 2015. However, this finding contrasts with repeated reports of Western agencies that the quality of China's output in science is lagging other advanced nations, even as it has caught up in numbers of articles. The difference between the results presented here and the previous results depends mainly upon field normalizations, which classify source journals by discipline. Average citation rates of these subsets are commonly used as a baseline so that one can compare among disciplines. However, the expected value of the top-1% of a sample of N papers is N/100, ceteris paribus. Using the average citation rates as expected values, errors are introduced by (1) using the mean of highly skewed distributions and (2) a specious precision in the delineations of the subsets. Classifications can be used for the decomposition, but not for the normalization. When the data is thus decomposed, the USA ranks ahead of China in biomedical fields such as virology. Although the number of papers is smaller, China outperforms the US in the field of Business and Finance (in the Social Sciences Citation Index; p<.05). Using percentile ranks, subsets other than indexing-based classifications can be tested for the statistical significance of differences among them.
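To make the expected-value reasoning above concrete, here is a minimal Python sketch (not the study's code; the paper counts are invented for illustration). Under the ceteris paribus null hypothesis, a set of N papers is expected to contribute N/100 papers to the global top-1%, and an exact binomial test indicates whether an observed count differs significantly from that expectation.

# Minimal sketch: compare an observed top-1% count with the N/100 expectation.
# The counts in the example call are hypothetical.
from scipy.stats import binomtest

def top1_excess(observed_top1: int, n_papers: int) -> dict:
    """Observed vs expected top-1% papers under the 1%-chance null hypothesis."""
    expected = n_papers / 100                           # ceteris paribus expectation
    test = binomtest(observed_top1, n_papers, p=0.01)   # exact binomial test
    return {
        "expected": expected,
        "observed": observed_top1,
        "ratio": observed_top1 / expected if expected else float("nan"),
        "p_value": test.pvalue,
    }

# Hypothetical example: 1,200 of a country's 80,000 papers are in the global top-1%.
print(top1_excess(1200, 80_000))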
Notions of research quality are contextual in many respects: they vary between fields of research, between review contexts and between policy contexts. Yet, the role of these co-existing notions in research, and in research policy, is poorly understood. In this paper we offer a novel framework to study and understand research quality across three key dimensions. First, we distinguish between quality notions that originate in research fields (Field-type) and in research policy spaces (Space-type). Second, drawing on existing studies, we identify three attributes (often) considered important for ‘good research’: its originality/novelty, plausibility/reliability, and value or usefulness. Third, we identify five different sites where notions of research quality emerge, are contested and institutionalised: researchers themselves, knowledge communities, research organisations, funding agencies and national policy arenas. We argue that the framework helps us understand processes and mechanisms through which ‘good research’ is recognised as well as tensions arising from the co-existence of (potentially) conflicting quality notions.
We analyzed how often and in what ways the Journal Impact Factor (JIF) is currently used in review, promotion, and tenure (RPT) documents of a representative sample of universities from the United States and Canada. 40% of research-intensive institutions and 18% of master’s institutions mentioned the JIF, or closely related terms. Of the institutions that mentioned the JIF, 87% supported its use in at least one of their RPT documents, 13% expressed caution about its use, and none heavily criticized it or prohibited its use. Furthermore, 63% of institutions that mentioned the JIF associated the metric with quality, 40% with impact, importance, or significance, and 20% with prestige, reputation, or status. We conclude that use of the JIF is encouraged in RPT evaluations, especially at research-intensive universities, and that there is work to be done to avoid the potential misuse of metrics like the JIF.
Citations are increasingly used as performance indicators in research policy and within the research system. Usually, citations are assumed to reflect the impact of the research or its quality. What is the justification for these assumptions and how do citations relate to research quality? These and similar issues have been addressed through several decades of scientometric research. This article provides an overview of some of the main issues at stake, including theories of citation and the interpretation and validity of citations as performance measures. Research quality is a multidimensional concept, where plausibility/soundness, originality, scientific value, and societal value commonly are perceived as key characteristics. The article investigates how citations may relate to these various research quality dimensions. It is argued that citations reflect aspects related to scientific impact and relevance, although with important limitations. In contrast, there is no evidence that citations reflect other key dimensions of research quality. Hence, an increased use of citation indicators in research evaluation and funding may imply less attention to these other research quality dimensions, such as solidity/plausibility, originality, and societal value.
The Independent Review of the Role of Metrics in Research Assessment and Management was set up in April 2014 to investigate the current and potential future roles that quantitative indicators can play in the assessment and management of research. Its report, ‘The Metric Tide’, was published in July 2015 and is available below.
The review was chaired by James Wilsdon, professor of science and democracy at the University of Sussex, supported by an independent and multidisciplinary group of experts in scientometrics, research funding, research policy, publishing, university management and research administration. Through 15 months of consultation and evidence-gathering, the review looked in detail at the potential uses and limitations of research metrics and indicators, exploring the use of metrics within institutions and across disciplines.
The main findings of the review include the following:
There is considerable scepticism among researchers, universities, representative bodies and learned societies about the broader use of metrics in research assessment and management.
Peer review, despite its flaws, continues to command widespread support as the primary basis for evaluating research outputs, proposals and individuals. However, a significant minority are enthusiastic about greater use of metrics, provided appropriate care is taken.
Carefully selected indicators can complement decision-making, but a ‘variable geometry’ of expert judgement, quantitative indicators and qualitative measures that respect research diversity will be required.
There is legitimate concern that some indicators can be misused or ‘gamed’: journal impact factors, university rankings and citation counts being three prominent examples.
The data infrastructure that underpins the use of metrics and information about research remains fragmented, with insufficient interoperability between systems.
Analysis concluded that no metric can currently provide a like-for-like replacement for REF peer review.
In assessing research outputs in the REF, it is not currently feasible to use quantitative indicators alone.
In assessing impact in the REF, it is not currently feasible to use quantitative indicators in place of narrative case studies. However, there is scope to enhance the use of data in assessing research environments.
The review identified 20 recommendations for further work and action by stakeholders across the UK research system. They propose action in the following areas: supporting the effective leadership, governance and management of research cultures; improving the data infrastructure that supports research information management; increasing the usefulness of existing data and information sources; using metrics in the next REF; and coordinating activity and building evidence.
These recommendations are underpinned by the notion of ‘responsible metrics’ as a way of framing appropriate uses of quantitative indicators in the governance, management and assessment of research. Responsible metrics can be understood in terms of the following dimensions:
Robustness: basing metrics on the best possible data in terms of accuracy and scope
Humility: recognising that quantitative evaluation should support – but not supplant – qualitative, expert assessment
Transparency: keeping data collection and analytical processes open and transparent, so that those being evaluated can test and verify the results
Diversity: accounting for variation by field, and using a range of indicators to reflect and support a plurality of research and researcher career paths across the system
Reflexivity: recognising and anticipating the systemic and potential effects of indicators, and updating them in response.
text from: http://www.hefce.ac.uk/pubs/rereports/year/2015/metrictide/
The prediction of the long-term impact of a scientific article is a challenging task, which bibliometricians address by resorting to a proxy whose reliability increases with the breadth of the citation window. In national research assessment exercises using metrics, the citation window is necessarily short, but in some cases it is sufficient to advise the use of simple citations. For the Italian VQR 2011–2014, the choice was instead made to adopt a linear weighted combination of citations and journal metric percentiles, with weights differentiated by discipline and year. Given the strategic importance of the exercise, whose results inform the allocation of a significant share of resources for the national academic system, we examined whether the predictive power of the proposed indicator is stronger than the simple citation count. The results show the opposite, for all disciplines in the sciences and a citation window above 2 years.
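As an illustration of the kind of indicator described above (a sketch only; the weight values and field labels are invented assumptions, not the official VQR coefficients), the combined score is a convex combination of a citation percentile and a journal-metric percentile, with a weight specific to discipline and year.

# Minimal sketch of a linear weighted combination of citation and journal percentiles.
def combined_indicator(citation_pct: float, journal_pct: float,
                       discipline: str, year: int,
                       weights: dict) -> float:
    """Return w * citation percentile + (1 - w) * journal-metric percentile."""
    w = weights[(discipline, year)]        # discipline- and year-specific weight
    return w * citation_pct + (1 - w) * journal_pct

# Hypothetical weights: citations count for 70% in physics 2013 and 50% in 2011.
weights = {("physics", 2013): 0.7, ("physics", 2011): 0.5}
print(combined_indicator(82.0, 65.0, "physics", 2013, weights))   # 76.9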
Use these ten principles to guide research evaluation, urge Diana Hicks, Paul Wouters and colleagues. Before 2000, there was the Science Citation Index on CD-ROM from the Institute for Scientific Information (ISI), used by experts for specialist analyses. In 2002, Thomson Reuters launched an integrated web platform, making the Web of Science database widely accessible. Competing citation indices were created: Elsevier's Scopus (released in 2004) and Google Scholar (beta version released in 2004). Web-based tools to easily compare institutional research productivity and impact were introduced, such as InCites (using the Web of Science) and SciVal (using Scopus), as well as software to analyse individual citation profiles using Google Scholar (Publish or Perish, released in 2007). In 2005, Jorge Hirsch, a physicist at the University of California, San Diego, proposed the h-index, popularizing citation counting for individual researchers. Interest in the journal impact factor grew steadily after 1995 (see 'Impact-factor obsession'). Lately, metrics related to social usage have also gained attention.
This study investigates the relationship between bibliometric indicators and the outcomes of peer reviews. Based on a case study of research groups at the University of Bergen, Norway, we examine how various bibliometric indicators correlate with evaluation ratings given by expert committees. The analysis shows positive but relatively weak correlations for all the selected indicators. Particular attention is devoted to the reasons for the discrepancies. We find that shortcomings of the peers' assessments, of the bibliometric indicators, as well as lack of comparability, can explain why the correlation was not stronger.
After a review of developments in the quantitative study of science, particularly since the early 1970s, I focus on two current main lines of ‘measuring science’ based on bibliometric analysis. With the developments in the Leiden group as an example of daily practice, the measurement of research performance and, particularly, the importance of indicator standardisation are discussed, including aspects such as interdisciplinary relations, collaboration, and ‘knowledge users’. Several important problems are addressed: language bias; timeliness; comparability of different research systems; statistical issues; and the ‘theory-invariance’ of indicators. Next, an introduction to the mapping of scientific fields is presented. Here basic concepts and issues of practical application of these ‘science maps’ are addressed. This contribution is concluded with general observations on current and near-future developments, including network-based approaches; necessary ‘next steps’ are formulated, and an answer is given to the question ‘Can science be measured?’
Development of bibliometric techniques has reached such a level as to suggest their integration with, or total substitution for, classic peer review in national research assessment exercises, as far as the hard sciences are concerned. In this work we compare ranking lists of universities produced by the first Italian evaluation exercise, through peer review, with the results of bibliometric simulations. The comparison shows great differences between peer review and bibliometric rankings for excellence and productivity.
A citation study of the 692 staff who make up unit of assessment 58 (archaeology) in the 2001 UK Research Assessment Exercise (RAE) was undertaken. Unlike earlier studies, which were obliged to make assumptions about who and what had been submitted for assessment, these details were, for the first time, available from the RAE Web site. This study therefore used the specific submission details of authors and their publications. Using the Spearman rank-order correlation coefficient, all results showed a high, statistically significant correlation between the RAE result and citation counts. The results were significant at the 0.01 per cent level. The findings confirm earlier studies. Given the comparative cost and ease of citation analysis, it is recommended that, correctly applied, it should be the initial tool of assessment for the RAE. Panel members would then exercise their judgement and skill to confirm final rankings.
In this paper we present characteristics of the statistical correlation between the Hirsch (h-) index and several standard bibliometric indicators, as well as with the results of peer review judgment. We use the results of a large evaluation study of 147 university chemistry research groups in the Netherlands covering the work of about 700 senior researchers during the period 1991-2000. Thus, we deal with research groups rather than individual scientists, as we consider the research group as the most important work floor unit in research, particularly in the natural sciences. Furthermore, we restrict the citation period to a three-year window instead of life time counts in order to focus on the impact of recent work and thus on current research performance. Results show that the h-index and our bibliometric crown indicator both relate in a quite comparable way with peer judgments. But for smaller groups in fields with less heavy citation traffic the crown indicator appears to be a more appropriate measure of research performance.
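For concreteness, a minimal Python sketch of the two group-level quantities compared above: the h-index and a crown-style ratio of actual to field-expected citations over a fixed window (the citation counts and field baselines are invented, and the exact CWTS crown definition may differ in detail).

# Minimal sketch: group-level h-index and a crown-style normalised citation ratio.
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def crown_style(citations, expected):
    """Ratio of total actual citations to total field-expected citations
    (a CPP/FCSm-style ratio; the published definition may differ)."""
    return sum(citations) / sum(expected)

group_cites = [34, 21, 15, 9, 7, 4, 2, 1, 0, 0]                    # 3-year window counts (hypothetical)
baselines   = [8.0, 8.0, 6.5, 6.5, 6.5, 5.0, 5.0, 5.0, 4.0, 4.0]   # assumed field expectations
print(h_index(group_cites))                          # 5
print(round(crown_style(group_cites, baselines), 2)) # ~1.59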
In a rapidly changing and inter-disciplinary world it is important to understand the nature and generation of knowledge, and its social organization. Increasing attention is paid in the social sciences and management studies to the constitution and claims of different theories, perspectives, and 'paradigms'. This book is one of the most respected and robust analyses of these issues. For this new paperback edition Richard Whitley - a leading figure in European business education - has written a new introduction which addresses the particular epistemological issues presented by management and business studies. He approaches the sciences as differently organized systems for the production and validation of knowledge - systems which become established in particular contexts and which generate different sorts of knowledge. He identifies seven major types of scientific field and discusses the establishment and growth of these sciences, including the major consequences of the nineteenth-century expansion of employment opportunities for researchers; the competitive pursuit of public reputations; and the domination of intellectual work by employees. He also examines the divergences in the way research is organized and controlled both in different fields, and in the same field within different historical circumstances. This book will be of interest to all graduate students concerned with the social study of knowledge, science, technology, and the history and philosophy of science.
Citation analysis has been a prevalent method in the field of information science, especially research on bibliometrics and evaluation, but its validity relies heavily on how the citations are treated. It is essential to study authors’ citing motivations to identify citations with different values and significance. This study applied a meta-synthesis approach to establish a new holistic classification of citation motivations based on previous studies. First, we used a four-step search strategy to identify related articles on authors’ citing motivations. Thirty-eight primary studies were included after the inclusion and exclusion criteria were applied and appraised using the Evidence-based Librarianship checklist. Next, we decoded and recoded the citing motivations found in the included studies, following the standard procedures of meta-synthesis. Thirty-five descriptive concepts of citation motivations emerged, which were then synthesized into 13 analytic themes. As a result, we proposed a comprehensive classification, including two main categories of citing reasons, i.e., “scientific motivations” and “tactical motivations.” Generally, the citations driven by scientific motivations serve as a rhetorical function, while tactical motivations are social or benefit-oriented and not easily captured through text-parsing. Our synthesis contributes to bibliometric and scientific evaluation theory. The synthesized classification also provides a comprehensive and unified annotation schema for citation classification and helps identify the useful mentions of a reference in a citing paper to optimize citation- based measurements.
Research collaboration is promoted by governments and research funders, but if the relative prevalence and merits of collaboration vary internationally then different national and disciplinary strategies may be needed to promote it. This study compares the team size and field normalized citation impact of research across all 27 Scopus broad fields in the 10 countries with the most journal articles indexed in Scopus 2008–2012. The results show that team size varies substantially by discipline and country, with Japan (4.2) having two‐thirds more authors per article than the United Kingdom (2.5). Solo authorship is rare in China (4%) but common in the United Kingdom (27%). While increasing team size associates with higher citation impact in almost all countries and fields, this association is much weaker in China than elsewhere. There are also field differences in the association between citation impact and collaboration. For example, larger team sizes in the Business, Management & Accounting category do not seem to associate with greater research impact, and for China and India, solo authorship associates with higher citation impact in this field. Overall, there are substantial international and field differences in the extent to which researchers collaborate and the extent to which collaboration associates with higher citation impact.
The 1989 claim of ‘cold fusion’ was publicly heralded as the future of clean energy generation. However, subsequent failures to reproduce the effect heightened scepticism of this claim in the academic community, and effectively led to the disqualification of the subject from further study. Motivated by the possibility that such judgement might have been premature, we embarked on a multi-institution programme to re-evaluate cold fusion to a high standard of scientific rigour. Here we describe our efforts, which have yet to yield any evidence of such an effect. Nonetheless, a by-product of our investigations has been to provide new insights into highly hydrided metals and low-energy nuclear reactions, and we contend that there remains much interesting science to be done in this underexplored parameter space.
Link to full article: https://rdcu.be/bEAsT
Most Performance-based Research Funding Systems (PRFS) draw on peer review and bibliometric indicators, two different methodologies which are sometimes combined. A common argument against the use of indicators in such research evaluation exercises is their low correlation at the article level with peer review judgments. In this study, we analyse 191,000 papers from 154 higher education institutes which were peer reviewed in a national research evaluation exercise. We combine these data with 6.95 million citations to the original papers. We show that when citation-based indicators are applied at the institutional or departmental level, rather than at the level of individual papers, surprisingly large correlations with peer review judgments can be observed for some disciplines. In our evaluation of ranking prediction performance based on citation data, we show we can reduce the mean rank prediction error by 25% compared to previous work. This suggests that citation-based indicators are sufficiently aligned with peer review results at the institutional level to be used to lessen the overall burden of peer review on national evaluation exercises, leading to considerable cost savings.
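A minimal Python sketch of the institution-level comparison described above (all scores are invented, and this is a simplified illustration rather than the study's method): institutions are ranked by a citation indicator and by peer-review score, and a simple mean rank-prediction error summarises how far the citation-based ranking is from the peer-based one.

# Minimal sketch: rank institutions two ways and compute mean absolute rank error.
from scipy.stats import rankdata

citation_score = [2.1, 1.4, 3.0, 0.9, 1.8]   # e.g. mean normalised citation scores (hypothetical)
peer_score     = [3.2, 2.0, 2.4, 1.1, 2.9]   # e.g. institutional peer-review scores (hypothetical)

pred_rank = rankdata([-s for s in citation_score])   # rank 1 = best by citations
true_rank = rankdata([-s for s in peer_score])       # rank 1 = best by peer review
mre = sum(abs(p - t) for p, t in zip(pred_rank, true_rank)) / len(pred_rank)
print(pred_rank, true_rank, mre)   # mean absolute rank error = 0.8 here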
Although altmetrics and other web-based alternative indicators are now commonplace in publishers' websites, they can be difficult for research evaluators to use because of the time or expense of the data, the need to benchmark in order to assess their values, the high proportion of zeros in some alternative indicators, and the time taken to calculate multiple complex indicators. These problems are addressed here by (a) a field normalisation formula, the Mean Normalised Log-transformed Citation Score (MNLCS) that allows simple confidence limits to be calculated and is similar to a proposal of Lundberg, (b) field normalisation formulae for the proportion of cited articles in a set, the Equalised Mean-based Normalised Proportion Cited (EMNPC) and the Mean-based Normalised Proportion Cited (MNPC), to deal with mostly uncited data sets, (c) a sampling strategy to minimise data collection costs, and (d) free unified software to gather the raw data, implement the sampling strategy, and calculate the indicator formulae and confidence limits. The approach is demonstrated (but not fully tested) by comparing the Scopus citations, Mendeley readers and Wikipedia mentions of research funded by Wellcome, NIH, and MRC in three large fields for 2013-2016. Within the results, statistically significant differences in both citation counts and Mendeley reader counts were found even for sets of articles that were less than six months old. Mendeley reader counts were more precise than Scopus citations for the most recent articles and all three funders could be demonstrated to have an impact in Wikipedia that was significantly above the world average.
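The MNLCS idea can be sketched as follows (a simplified reading of the formula, not the authors' released software; the citation counts and the field mean of ln(1+c) are illustrative assumptions). Each article's citation count is log-transformed as ln(1+c), divided by the world average of the same quantity for the field and year, then averaged, with normal-theory confidence limits.

# Minimal MNLCS-style sketch with approximate confidence limits.
import math
from statistics import mean, stdev

def mnlcs(citations, field_mean_log, z=1.96):
    """Return (score, lower limit, upper limit) for one group of articles."""
    normed = [math.log(1 + c) / field_mean_log for c in citations]
    m = mean(normed)
    se = stdev(normed) / math.sqrt(len(normed))
    return m, m - z * se, m + z * se

# Hypothetical group of 8 articles; assumed world mean of ln(1+c) for the field/year = 1.4.
print(mnlcs([0, 2, 3, 5, 7, 10, 14, 30], field_mean_log=1.4))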
During the Italian research assessment exercise, the national agency ANVUR performed an experiment to assess agreement between grades attributed to journal articles by informed peer review (IR) and by bibliometrics. A sample of articles was evaluated by using both methods and agreement was analyzed by weighted Cohen’s kappas. ANVUR presented results as indicating an overall “good” or “more than adequate” agreement. This paper re-examines the experiment results according to the available statistical guidelines for interpreting kappa values, by showing that the degree of agreement (always in the range 0.09–0.42) has to be interpreted, for all research fields, as unacceptable, poor or, in a few cases, as, at most, fair. The only notable exception, confirmed also by a statistical meta-analysis, was a moderate agreement for economics and statistics (Area 13) and its sub-fields. We show that the experiment protocol adopted in Area 13 was substantially modified with respect to all the other research fields, to the point that results for economics and statistics have to be considered as fatally flawed. The evidence of a poor agreement supports the conclusion that IR and bibliometrics do not produce similar results, and that the adoption of both methods in the Italian research assessment possibly introduced systematic and unknown biases in its final results. The conclusion reached by ANVUR must be reversed: the available evidence does not justify at all the joint use of IR and bibliometrics within the same research assessment exercise.
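For illustration, a minimal Python sketch of the agreement check described above, assuming grades are coded as ordinal integers (the grade lists and the interpretation bands are invented, not ANVUR's data or thresholds).

# Minimal sketch: weighted Cohen's kappa between peer-review and bibliometric grades.
from sklearn.metrics import cohen_kappa_score

peer   = [4, 3, 3, 2, 4, 1, 2, 3, 4, 2]    # hypothetical peer-review grades
biblio = [4, 2, 3, 2, 3, 1, 1, 3, 3, 2]    # hypothetical bibliometric grades

kappa = cohen_kappa_score(peer, biblio, weights="linear")
label = ("poor" if kappa < 0.2 else
         "fair" if kappa < 0.4 else
         "moderate" if kappa < 0.6 else
         "good")    # one common rule-of-thumb banding, not the guideline used by ANVUR
print(round(kappa, 2), label)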
This paper reports a citation analysis of the 1989 and 1990 publications of seven library or information studies departments in the UK. The total number of citations, the mean number of citations per member of staff, and the mean number of citations per publication were all strongly correlated (p<0.005), and the total number of publications less strongly correlated (p<0.05), with the ratings that these departments achieved in the last Research Assessment Exercise. An analysis of the citedness of different types of research output from these departments suggests that conference papers and articles in professional journals attract noticeably fewer citations than other types of output, and that articles in scientific journals attract noticeably more citations than articles in social science journals.
On the basis of investigating authors' opinions on citing motivations for chemistry papers, a quasi-quantitative model for citing is suggested. The model distinguishes professional and non-professional motivations for citing and introduces the citation threshold concept, which tries to characterize the effect of citing motivations quantitatively. Possible reasons for missing citations are also treated. Mean ages of real citations and of self-citations were calculated by subtracting the average of the publication years of the cited papers from the publication year of the citing publication. The difference between the mean ages may characterize the synchronicity of the author's research in comparison with those working on similar topics. The paper introduces the citation strategy indicator, which relates the impact factors of cited periodicals to the mean impact factor of periodicals in the corresponding research subfield.
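As a worked illustration of the two quantities just defined (all inputs invented, not the paper's data), the mean reference age is the citing year minus the average publication year of the cited papers, and the citation strategy indicator compares the impact factors of the cited journals with the subfield mean.

# Minimal sketch: mean reference age and a citation-strategy style ratio.
from statistics import mean

def mean_reference_age(citing_year, cited_years):
    """Citing year minus the average publication year of the cited papers."""
    return citing_year - mean(cited_years)

def citation_strategy(cited_journal_ifs, subfield_mean_if):
    """Mean impact factor of the cited journals relative to the subfield mean."""
    return mean(cited_journal_ifs) / subfield_mean_if

print(mean_reference_age(2020, [2018, 2015, 2012, 2019]))   # 4.0 years
print(round(citation_strategy([4.1, 2.8, 6.0], 3.0), 2))    # 1.43: above-average journals cited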
A citation study was carried out on all 217 academics who teach in UK library and information science schools. These authors between them received 622 citations in Social Scisearch for articles they had published between 1988 and the present. The results were ranked by department, and compared to the ratings awarded to the departments in the 1992 Universities Funding Council Research Assessment Exercise. Using the Spearman Rank Order Correlation coefficient, it was found that there is a statistically significant correlation between the numbers of citations received by a department in total, or the average number of citations received in the department per academic, and the Research Assessment Exercise rating. The paper concludes that this provides further independent support for the validity of citation counting, even when using just the first authors as a search tool for cited references. The paper also concludes that the cost and effort of the Research Assessment Exercise may not be justified when a simpler and cheaper alternative, namely a citation counting exercise, could be undertaken. The paper also concludes that the University of North London would probably have benefitted from being included in the 1992 Research Assessment Exercise.
A study was carried out to assess the correlation between scores achieved by academic departments in the UK in the 1992 Research Assessment Exercise, and the number of citations received by academics in those departments for articles published in the period 1988–1992, using the Institute for Scientific Information’s citation databases. Only those papers first authored by academics identified from the Commonwealth Universities Yearbook were examined. Three subject areas: Anatomy, Genetics and Archaeology were chosen to complement Library and Information Management that had already been the subject of such a study. It was found that in all three cases, there is a statistically significant correlation between the total number of citations received, or the average number of citations per member of staff, and the Research Assessment Exercise score. Surprisingly, the strongest correlation was found in Archaeology, a subject noted for its heavy emphasis on monographic literature and with relatively low citation counts. The results make it clear that citation counting provides a robust and reliable indicator of the research performance of UK academic departments in a variety of disciplines, and the paper argues that for future Research Assessment Exercises, citation counting should be the primary, but not the only, means of calculating Research Assessment Exercise scores.
To determine influences on the production of a scientific article, the content of the article must be studied. We examined articles in biogeography and found that most of the influence is not cited, that specific types of influential articles are cited while other types that are also influential are not cited, and that work which is “uncited” or “seldom cited” is used extensively. As a result, evaluative citation analysis should take uncited work into account.
In this paper, first results are presented of a study on the correlation between bibliometric indicators and the outcomes of peer judgements made by expert committees of physics in the Netherlands. As a first step to studying these outcomes in more detail, we focus on the results of an evaluation of 56 research programmes in condensed matter physics in the Netherlands, a subfield which accounts for roughly one third of the total of Dutch physics. This set of research programmes is represented by a volume of more than 5000 publications and nearly 50,000 citations. The study shows varying correlations between different bibliometric indicators and the outcomes of a peer evaluation procedure. A breakdown of correlations to the level of different peer review criteria has also been made. We found that the peer review criterion ‘team’ generally shows the strongest correlation with bibliometric indicators. Correlations prove to be higher for groups involved in basic science than for groups which are more application oriented.
The Leiden Ranking 2011/2012 is a ranking of universities based on bibliometric indicators of publication output, citation impact, and scientific collaboration. The ranking includes 500 major universities from 41 different countries. This paper provides an extensive discussion of the Leiden Ranking 2011/2012. The ranking is compared with other global university rankings, in particular the Academic Ranking of World Universities (commonly known as the Shanghai Ranking) and the Times Higher Education World University Rankings. Also, a detailed description is offered of the data collection methodology of the Leiden Ranking 2011/2012 and of the indicators used in the ranking. Various innovations in the Leiden Ranking 2011/2012 are presented. These innovations include (1) an indicator based on counting a university's highly cited publications, (2) indicators based on fractional rather than full counting of collaborative publications, (3) the possibility of excluding non-English language publications, and (4) the use of stability intervals. Finally, some comments are made on the interpretation of the ranking, and a number of limitations of the ranking are pointed out.
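To illustrate one of these innovations, a minimal Python sketch of fractional versus full counting of collaborative publications (my own illustration with invented data, not the Leiden Ranking code): a paper with k participating universities contributes 1/k to each of them under fractional counting, rather than a full count to each.

# Minimal sketch: full vs fractional counting of collaborative papers.
from collections import defaultdict

papers = [                      # hypothetical papers, each listing its universities
    ["Univ X", "Univ Y"],
    ["Univ X"],
    ["Univ X", "Univ Y", "Univ Z"],
]

full, frac = defaultdict(float), defaultdict(float)
for unis in papers:
    contributors = set(unis)
    for u in contributors:
        full[u] += 1                        # full counting
        frac[u] += 1 / len(contributors)    # fractional counting
print(dict(full))   # {'Univ X': 3.0, 'Univ Y': 2.0, 'Univ Z': 1.0}
print(dict(frac))   # Univ X ≈ 1.83, Univ Y ≈ 0.83, Univ Z ≈ 0.33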
In December 2003, seventeen years after the first UK research assessment exercise, Italy started its first-ever national research evaluation, with the aim of evaluating, using the peer review method, the excellence of the national research production. The evaluation involved 20 disciplinary areas, 102 research structures, 18,500 research products and 6,661 peer reviewers (1,465 from abroad); it had a direct cost of 3.55 million Euros and a duration spanning over 18 months. The introduction of ratings based on ex post quality of output, and not on ex ante respect for parameters and compliance, is an important leap forward for the national research evaluation system toward meritocracy. From the bibliometric perspective, the national assessment offered the unprecedented opportunity to perform a large-scale comparison of peer review and bibliometric indicators for an important share of the Italian research production. The present investigation takes full advantage of this opportunity to test whether peer review judgements and (article and journal) bibliometric indicators are independent variables and, in the negative case, to measure the sign and strength of the association. The outcomes allow us to advocate the use of bibliometric evaluation, suitably integrated with expert review, for the forthcoming national assessment exercises, with the goal of shifting from the assessment of research excellence to the evaluation of average research performance without a significant increase of expenses.
To identify the 100 top-cited articles ever published in rehabilitation journals and to analyze their characteristics as a quantitative approach to investigating the quality and evolution of rehabilitation research.
The Institute for Scientific Information Web of Knowledge Database and the 2007 and 2008 Journal Citation Report Science Editions were used to retrieve the 100 top-cited articles from 30 rehabilitation dedicated journals.
The 100 top-cited articles included randomized controlled trials, case-control studies, case series studies, case reports, methodologic studies, systematic reviews, narrative reviews, and expert opinions.
Two independent reviewers performed data extraction from the retrieved articles and compared their results. The Sackett's initial rules of evidence were used to categorize the type of study design as well as to evaluate the level of evidence provided by the results of the 100 top-cited articles.
Among the 45,700 articles published in these journals, the 100 top-cited articles were published between 1959 and 2002 with an average of 200 citations an article (range, 131-1109). Top-cited articles were all English-language, primarily from North America (United States=67%; Canada=11%) and published in 11 journals led by the Archives of Physical Medicine and Rehabilitation. Eighty-four percent of the articles were original publications and were most commonly prospective (76%) case series studies (67%) that used human subjects (96%) providing level 4 evidence. Neurorehabilitation (41%), disability (19%), and biomechanics (18%) were the most common fields of study.
We demonstrated that methodologic observational studies performed in North America and published in English have had the highest citations in rehabilitation journals.
We counted the citations received in one year (1998) by each staff member in each of 38 university psychology departments in the United Kingdom. We then averaged these counts across individuals within each department and correlated the averages with the Research Assessment Exercise (RAE) grades awarded to the same departments in 1996 and 2001. The correlations were extremely high (up to +0.91). This suggests that whatever the merits and demerits of the RAE process and citation counting as methods of evaluating research quality, the two approaches measure broadly the same thing. Since citation counting is both more cost-effective and more transparent than the present system and gives similar results, there is a prima facie case for incorporating citation counts into the process, either alone or in conjunction with other measures. Some of the limitations of citation counting are discussed and some methods for minimising these are proposed. Many of the factors that dictate caution in judging individuals by their citations tend to average out when whole departments are compared.
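A minimal Python sketch of the departmental-level calculation described above (all figures invented): average the citation counts of staff within each department and correlate the departmental means with the assessment grades using a Spearman rank correlation.

# Minimal sketch: departmental mean citations vs assessment grades (Spearman).
from statistics import mean
from scipy.stats import spearmanr

staff_citations = {                # department -> citations of each staff member (hypothetical)
    "A": [12, 30, 8, 45, 3],
    "B": [2, 5, 1, 0, 7],
    "C": [20, 18, 25, 9],
    "D": [6, 11, 4, 2, 9, 3],
    "E": [40, 22, 35, 15],
}
grades = {"A": 5, "B": 3, "C": 4, "D": 3, "E": 5}   # hypothetical panel grades

depts = sorted(staff_citations)
dept_means = [mean(staff_citations[d]) for d in depts]
dept_grades = [grades[d] for d in depts]
rho, p = spearmanr(dept_means, dept_grades)
print(round(rho, 2), round(p, 3))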
Introduction. This study aimed to explore research assessment within the field of music and, specifically, to investigate whether citation counting could be used to replace or inform the peer review system currently in use in the UK. Method. A citation analysis of academics submitted for peer review in Unit of Assessment 67 in the 2001 Research Assessment Exercise was performed using the Arts and Humanities Citation Index and checked for correlations with the Assessment scores. A Spearman rank order correlation coefficient test was used to assess the significance of correlations between citations and scores. Results. At a departmental level, citation counts correlated strongly with scores awarded by the Assessment Exercise. A weaker correlation was found between scores and individual counts. The correlations were significant at the 0.01% level. Types of submission were analysed and trends were found within the author group. However, the Arts and Humanities Citation Index was found to be unrepresentative of music research activity in UK universities due to its choice of source material. Conclusion. The Arts and Humanities Citation Index alone is not a suitable data source for citation analysis in the field of music. However, if an alternative data source could be found, there is potential for the use of citation analysis in research assessment in music.
Evaluations of research quality in universities are now widely used in the advanced economies. The UK's Research Assessment Exercise (RAE) is the most highly developed of these research evaluations. This article uses the results from the 2001 RAE in political science to assess the utility of citations as a measure of outcome, relative to other possible indicators. The data come from the 4,400 submissions to the RAE political science panel. The 28,128 citations analysed relate not only to journal articles, but to all submitted publications – including authored and edited books and book chapters. The results show that citations are the most important predictor of the RAE outcome, followed by whether or not a department had a representative on the RAE panel. The results highlight the need to develop robust quantitative indicators to evaluate research quality which would obviate the need for a peer evaluation based on a large committee. Bibliometrics should form the main component of such a portfolio of quantitative indicators.
This account of the Matthew effect is another small exercise in the psychosociological analysis of the workings of science as a social institution. The initial problem is transformed by a shift in theoretical perspective. As originally identified, the Matthew effect was construed in terms of enhancement of the position of already eminent scientists who are given disproportionate credit in cases of collaboration or of independent multiple discoveries. Its significance was thus confined to its implications for the reward system of science. By shifting the angle of vision, we note other possible kinds of consequences, this time for the communication system of science. The Matthew effect may serve to heighten the visibility of contributions to science by scientists of acknowledged standing and to reduce the visibility of contributions by authors who are less well known. We examine the psychosocial conditions and mechanisms underlying this effect and find a correlation between the redundancy function of multiple discoveries and the focalizing function of eminent men of science, a function which is reinforced by the great value these men place upon finding basic problems and by their self-assurance. This self-assurance, which is partly inherent, partly the result of experiences and associations in creative scientific environments, and partly a result of later social validation of their position, encourages them to search out risky but important problems and to highlight the results of their inquiry. A macrosocial version of the Matthew principle is apparently involved in those processes of social selection that currently lead to the concentration of scientific resources and talent.
CoARA. The agreement on reforming research assessment.
HEFCE (2015). The Metric Tide: Correlation analysis of REF2014 scores and metrics (Supplementary Report II to the Independent Review of the Role of Metrics in Research Assessment and Management). Higher Education Funding Council for England. https://www.ukri.org/publications/review-of-metrics-in-researchassessment-and-management/
UKRI. How Research England supports research excellence.
Thelwall, M., Kousha, K., Abdoli, M., Stuart, E., Makita, M., Wilson, P., & Levitt, J. (2022). Can REF output quality scores be assigned by AI? Experimental evidence. arXiv preprint arXiv:2212.08041.
Borgman, C. L., & Furner, J. (2002). Scholarly communication and bibliometrics. Annual Review of Information Science and Technology, 36(1), 1-53.
Mahdi, S., D'Este, P., & Neely, A. (2008). Citation counts: Are they good predictors of RAE scores? London: Advanced Institute of Management Research. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1154053
Shadgan, B., Roig, M., HajGhanbari, B., & Reid, W. D. (2010). Top-cited articles in rehabilitation. Archives of Physical Medicine and Rehabilitation, 91(5), 806-815.