The strain on scientific publishing
Mark A. Hanson1, Pablo Gómez Barreiro2, Paolo Crosetto3, Dan Brockington4
Author correspondence:
MAH (m.hanson@exeter.ac.uk, ORCID: https://orcid.org/0000-0002-6125-3672)
PGB (p.gomez@kew.org, ORCID: https://orcid.org/0000-0002-3140-3326)
PC (paolo.crosetto@inrae.fr, ORCID: https://orcid.org/0000-0002-9153-0159)
DB (Daniel.Brockington@uab.cat, ORCID: https://orcid.org/0000-0001-5692-0154)
1. Centre for Ecology and Conservation, Faculty of Environment, Science and
Economy, University of Exeter, Penryn, TR10 9FE, United Kingdom
2. Royal Botanic Gardens, Kew, Wakehurst, Ardingly, West Sussex RH17 6TN, United
Kingdom
3. Univ. Grenoble Alpes, INRAE, CNRS, Grenoble INP, GAEL, Grenoble 38000, France
4. Institut de Ciència i Tecnologia Ambientals (ICTA), Universitat Autònoma de
Barcelona & ICREA, Pg. Lluís Companys 23, Barcelona, Spain
Abstract
Scientists are increasingly overwhelmed by the volume of articles being published. Total
articles indexed in Scopus and Web of Science have grown exponentially in recent years; in
2022 the article total was ~47% higher than in 2016, which has outpaced the limited growth, if any, in the number of practising scientists. Thus, publication workload per scientist (writing,
reviewing, editing) has increased dramatically. We define this problem as “the strain on
scientific publishing.” To analyse this strain, we present five data-driven metrics showing
publisher growth, processing times, and citation behaviours. We draw these data from web
scrapes, requests for data from publishers, and material that is freely available through
publisher websites. Our findings are based on millions of papers produced by leading
academic publishers. We find specific groups have disproportionately grown in their articles
published per year, contributing to this strain. Some publishers enabled this growth by
adopting a strategy of hosting “special issues,” which publish articles with reduced turnaround
times. Given pressures on researchers to “publish or perish” to be competitive for funding
applications, this strain was likely amplified by these offers to publish more articles. We also
observed widespread year-over-year inflation of journal impact factors coinciding with this
strain, which risks confusing quality signals. Such exponential growth cannot be sustained.
The metrics we define here should enable this evolving conversation to reach actionable
solutions to address the strain on scientific publishing.
Acronyms
Impact Factor (IF), Scimago Journal Rank (SJR)
Introduction
Academic publishing has a problem. The last few years have seen an exponential growth in
the number of peer-reviewed journal articles, which has not been matched by the training of
new researchers who can vet those articles (Fig. 1A). Editors are reporting difficulties in
recruiting qualified peer reviewers (1, 2), and scientists are overwhelmed by the immense total
of new articles being published (3, 4). We will call this problem “the strain on scientific
publishing.”
Part of this growth may come from inclusivity initiatives or investment in the Global South,
which make publishing accessible to more researchers (5, 6). Parallel efforts have also
appeared in recent years to combat systemic biases in scientific publishing (7-9), including
positive-result bias (10). If this strain on scientific publishing comes from such initiatives, it
would be welcome and should be accommodated.
However, this strain may compromise the ability of scientists to be rigorous when vetting
information (11). If scientific rigour is allowed to slip, it devalues the term “science” (12). Recent
controversies already demonstrate this threat, as research paper mills operating within
publishing groups have caused mass article retractions (13-15), alongside renewed calls to
address so-called “predatory publishing” (16).
To understand the forces that contribute to this strain, we first present a simple schematic to
describe scientific publishing. We then specifically analyse publishers, as their infrastructures
regulate the rate at which growth in published articles can occur. To do this, we identify five
key metrics that help us to understand the constitution and origins of this strain: growth in total
articles and special issues, differences in article turnaround times or rejection rates, and a new
metric informing on journal quality that we call “impact inflation.”
These metrics should be viewed in light of publisher business models. First, there is the more
classic subscription-based model generating revenue from readers. Second, there is the “gold
open access” model, which generates revenue through article processing charges that
authors pay instead. In both cases publishers can act either as for-profit or not-for-profit
organisations. We therefore consider if aspects of either of these business models are
contributing to the strain.
Here we provide a comparative analysis, combining multiple metrics, to reveal what has
generated the strain on scientific publishing. We find strain is not strictly tied to any one
publisher business model, although some behaviours are associated with specific gold open
access publishers. We argue that existing efforts to address this strain are insufficient. We
highlight specific areas needing transparency, and actions that publishers, researchers, and
funders can take to respond to this strain. Our study provides the essential data to inform the
existing conversation on academic publishing practices.
Framework and Methods
The love triangle of scientific publishing: a conceptual framework
The strain on scientific publishing is the result of interactions between three sets of players:
publishers, researchers, and funders.
Publishers want to publish as many papers as possible, subject to a quality constraint. They
give researchers “publication”, i.e. a “badge of quality” that researchers use for their own goals.
The quality of a badge is often determined by journal-level prestige metrics, such as the
Clarivate journal Impact Factor (IF), or Scopus Scimago Journal Rank (SJR) (17, 18), and
ultimately by association with the quality of published papers. Publishers compete with each
other to attract the most and/or the best papers.
Researchers want to publish as many papers in prestigious journals as possible, subject to an
effort constraint. They do so because publications and citations are key to employment,
promotion, and funding: so called “publish or perish” (12, 19). Researchers act as authors that
generate articles, but can also be referees and editors that consult for publishers and funders
for free. In exchange, they gain influence over administering publisher badges of quality and
who gets limited jobs or funding. More altruistically, they help ensure the quality of science in
their field.
Funders (e.g. universities, funding agencies) use “badges” from the science publication
market as measures of quality to guide their decisions on whom to hire and fund (20, 21); in
some countries, journal badges directly determine promotion or salary (e.g. (22)). Ultimately,
money from funders supports the whole market, and funders want cost-effective and
informative signals to help guide their decisions.
The incentives for publishers and researchers to increase their output drive growth. This is
not problematic per se, but it should not come at the expense of research quality. The difficulty
is that “quality” is hard to define (17, 18, 23), and some metrics are at risk of abuse per
Goodhart’s law: “when a measure becomes a target, it ceases to be a good measure” (24).
For instance, having many citations can indicate that an author, article, or journal is having an
impact. But citations can be gamed through self-citing or coordinated “citation cartels” (25,
26).
Collectively, the push and pull of these players' motivations defines the overall output of the scientific publishing industry.
Data collection and analysis
A full summary of our data methodology is given in the supplementary materials and methods.
In brief: we produced five metrics of publisher practice that describe the total volume of
material being published, or that affect the quality of publisher “badges”. We focused our
analyses on the last decade of publication growth, with special attention paid to the period of
2016-2022, as some data types were less available before 2016. We used the Scopus database
(via Scimago (27)) filtered for journals indexed in both Scopus and Web of Science. We further
assembled journal/article data by scraping information in the public domain from web pages,
and/or following direct request to publishers. These metrics are:
Total articles indexed in both Scopus and Web of Science
Share of articles appearing in special issues
Article turnaround times from submission to acceptance
Journal rejection rates as defined by publishers
A new metric we call “impact inflation,” informed by journal citation behaviours
Due to limits in web scraping data availability, for special issue proportions, turnaround times,
and rejection rates, we focused on only a subset of publishers and articles (Table 1). Further,
due to copyright concerns over our web scraping of information in the public domain, we have
been legally advised to forego a public release of our data and scripts at this time, but will
make these available for formal peer review. High resolution versions of the figures can be
found at doi: 10.6084/m9.figshare.24203790.
Table 1: Summary of web-scraped data informing the share of special issue articles and turnaround times. For some publishers, the number of web-scraped journals or articles with turnaround time data exceeds the totals from our Scimago dataset (noted with *). This is because, in this second dataset, we included all journals by a given publisher, even if they were not indexed, or were indexed by only one of Scopus or Web of Science.
Results
A few publishers disproportionately contribute to total article growth
There were ~896k more indexed articles per year in 2022 (~2.82m articles) compared to 2016
(~1.92m articles) (Fig. 1A), a year-on-year growth of ~5.6% over this time period. To
understand the source of this substantial growth, we first divided article output across
publishers per Scopus publisher labels (Fig. 1B). The five largest publishers by total article
output include Elsevier, Multidisciplinary Publishing Institute (MDPI), Wiley-Blackwell (Wiley),
Springer, and Frontiers Media (Frontiers), respectively.
Figure 1: Total article output is increasing. A) Total articles being published per year has increased exponentially, while PhDs being awarded have not kept up. This remains true with the addition of non-OECD countries, or when using global total employed researcher-hours instead of PhD graduates as a proxy for active researchers (Fig. 1supp1). B-C) Total articles per year by publisher (B), or per journal per year by publisher (C). Also see growth in journals per publisher (Fig. 1supp2) and by size class (Fig. 1supp3).
Figure 2: Rise of the special issue model of publishing. Normal articles (blue) and special issue
articles (red) over time. Frontiers, Hindawi, and especially MDPI publish a majority of their articles
through special issues, including an increase in recent years alongside growth seen in Fig. 1 (detailed
further in Fig. 2supp1,2). These data reflect only a fraction of total articles shown in Fig. 1, limited due
to sampling methodology (Table 1).
However, in terms of strain added since 2016, their rank order changes: journals from MDPI (~27%), Elsevier (~16%), Frontiers (~11%), Springer (~9.5%), and Wiley (~6.8%) have contributed >70% of the increase in articles per year. Elsevier and Springer own a huge proportion of total journals, a number that has also increased over the past decade (Fig. 1supp2). As such, we normalised article output per journal to decouple the immensity of groups like Elsevier and Springer from the growth of articles itself. While Elsevier has increased article output per journal slightly, other groups, such as MDPI and Frontiers, have become disproportionately high producers of published articles per journal (Fig. 1C).
Taken together, groups like Elsevier and Springer have quantitatively increased total article
output by distributing articles across an increasing number of journals. Meanwhile groups like
MDPI and Frontiers have been exponentially increasing the number of publications handled
by a much smaller pool of journals. These publishers reflect two different mechanisms that
have promoted the exponential increase in total articles published over the last few years.
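As an illustration of the growth calculation, the following minimal R sketch (not our exact analysis pipeline; the data frame structure and the two placeholder data points are illustrative, taken from the endpoints quoted above) fits a log-linear model to annual article totals to estimate a constant year-on-year growth rate:

    # Minimal sketch (not the authors' code): estimate a constant year-on-year
    # growth rate by fitting a log-linear model to annual article totals.
    estimate_growth <- function(totals) {
      fit <- lm(log(articles) ~ year, data = totals)  # exponential growth is linear in log space
      exp(coef(fit)[["year"]]) - 1                    # implied annual growth fraction
    }

    # Placeholder endpoints from the text (1.92m articles in 2016, 2.82m in 2022);
    # the ~5.6%/year figure reported above is based on the full yearly series.
    example <- data.frame(year = c(2016, 2022), articles = c(1.92e6, 2.82e6))
    estimate_growth(example)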
Growth in articles published through “special issues”
“Special issue” articles are distinct from standard articles because they are invited by journals or
editors, rather than submitted independently by authors. They also delegate responsibilities to
guest editors, whereas editors for normal issues are formal staff of the publisher. In recent
years, certain publishers have adopted this business model as a route to publish the majority
of their articles (Fig. 2). This behaviour encourages researchers to generate articles
specifically for special issues, raising concerns that publishers could abuse this model for profit
(28). Here we describe this growth in special issues for eight publishers for which we could
collect data.
Between 2016 and 2022, the proportion of special issue articles grew drastically for Hindawi,
Frontiers, and MDPI (Fig. 2supp1,2). These publishers depend on article processing charges
for their revenues, which are paid by authors to secure gold open access licences. But this
special issue growth is not a necessary feature of open access publishing as similar changes
were not seen in other gold open access publishers (i.e. BMC, PLOS). Publishers using both
subscription and open access approaches (Nature, Springer, Wiley) also tended to publish
small proportions of special issues.
These data show that the strain generated by special issues is not a direct consequence of
the rise of open access publishing per se, or associated article processing charges. Instead,
the dominance of special issues in a publisher’s business model is publisher-dependent.
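A minimal R sketch of how such shares can be tabulated from scraped article records follows; it is not our exact pipeline, and the column names (publisher, year, and a logical is_special_issue flag) are assumptions about how the scraped data are organised.

    # Minimal sketch (not the authors' pipeline): proportion of special issue
    # articles per publisher and year from article-level records.
    special_issue_share <- function(articles) {
      aggregate(is_special_issue ~ publisher + year, data = articles, FUN = mean)
    }

    demo <- data.frame(publisher = "ExamplePress", year = 2022,
                       is_special_issue = c(TRUE, TRUE, FALSE))
    special_issue_share(demo)   # share = 2/3 for this made-up publisher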
Decreasing mean, increasing homogeneity of turnaround times
We define article turnaround times as the time taken from submission to acceptance. The peer
review process can take weeks to months depending on field of research and the magnitude
or type of revisions required, meaning turnaround times across articles and journals are
expected to vary. Turnaround times also reflect a trade-off between rigour and efficiency:
longer timeframes can allow greater rigour, but they delay publication. Shorter timeframes
could reflect greater efficiency, but rushing of timeframes could make mistakes more likely.
Given these considerations, there should be an objective, reasonable, minimum and
maximum turnaround time needed to conduct appropriate peer review. Moreover, a journal
performing rigorous peer review should have heterogenous turnaround times if each article is
considered and addressed according to its unique needs.
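A minimal R sketch of the turnaround calculation and the per-publisher summaries used below is given here; it is not our exact analysis code, the column names (submitted, accepted, publisher, year) are assumptions about the assembled article-level dataset, and the 1 day to 2 year filter mirrors the one described in Fig. 3.

    # Minimal sketch (not the authors' code): turnaround time in days from
    # submission to acceptance, summarised by publisher and year.
    turnaround_summary <- function(d) {
      d$days <- as.numeric(as.Date(d$accepted) - as.Date(d$submitted))
      d <- d[!is.na(d$days) & d$days >= 1 & d$days <= 730, ]  # drop artefacts (e.g. Unix-epoch defaults)
      # `days` becomes a two-column matrix (mean, sd) per publisher-year group
      aggregate(days ~ publisher + year, data = d,
                FUN = function(x) c(mean = mean(x), sd = sd(x)))
    }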
We analysed turnaround times between 2016 and 2022 for publications where data were
available. We found that average turnaround times vary markedly across publishers. Like
others (29, 30), we found that MDPI had an average turnaround time of ~37 days from first submission to acceptance in 2022, a level it has held since ~2018. This turnaround time is far lower than that of comparable publishers like Frontiers (72 days) and Hindawi (83 days), which also saw a decline in mean turnaround time between 2020 and 2022. On the other hand, other publishers in our dataset had turnaround times of >130 days and, if anything, their turnaround times increased slightly between 2016 and 2022 (Fig. 3A).
The publishers decreasing their turnaround times also show declining variances. Turnaround
times for Hindawi, Frontiers, and especially MDPI are becoming increasingly homogenous
(Fig. 3B and 3supp1). This implies these articles, regardless of initial quality or field of
research, and despite the expectation of heterogeneity, are all accepted in an increasingly
similar timeframe.
The decrease in mean turnaround times (Fig. 3A) also aligns with inflection points for the
exponential growth of articles published as part of special issues in Hindawi (2020), Frontiers
(2019), and MDPI (2016) (see Fig. 2supp1). We therefore asked if special issue articles are
processed more rapidly than normal articles in general. For most publishers, this was indeed
the case, even independent of proportions of normal and special issue articles (Fig. 3supp2).
Figure 3: Article turnaround times. A) Evolution of mean turnaround times by publisher. Only articles
with turnaround times between 1 day and 2 years were included. This filter was applied to remove data
anomalies such as immediate acceptance or missing values that default to Jan 1st 1970 (the “Unix
epoch”). B) Article turnaround time distribution curves from 2016-2022, focused on the first six months
to better show trends. While most publishers have a right-skewed curve, the three publishers highlighted
previously for increased special issue use have a left-skewed curve that only became more extreme
over time. These data reflect only a fraction of total articles shown in Fig. 1, limited due to sampling
methodology (see Table 1). Tay. & Fran. = Taylor & Francis.
Here we find that turnaround times differ by publisher, associated with use of the special issue
publishing model. Variance in turnaround times also decreases for publishers alongside
adoption of the special issue model. These results suggest that special issue articles are
typically accepted more rapidly and in more homogenous timeframes than normal articles,
which, to our knowledge, has never been formally described.
Journal rejection rates and trends are publisher-specific
If a publisher lowers their article rejection rates, all else being equal, it will lead to more articles
being published. Such changes to rejection rate might also mean more lower-quality articles
are being published. Peer review is the principal method of quality control that defines science
(31), and so publishing more articles with lower quality may add to strain and detract from the
meaning and authority of the scientific process.
The relationship between rejection and quality is complex. High rejection rates do not
necessarily reflect greater rigour: rigorous science can be rejected if the editors think that the
findings lack the scope required for their journal. Conversely, low rejection rates may reflect a
willingness to publish rigorous science independent of scope. The publisher also defines what
“rejection rate” means in-house, creating caveats for comparing raw numbers across
publishers.
Rejection rate data are rarely made public, and only a minority of publishers provide these
data routinely or shared rejection rates upon request. Using the rejection rate data we could
collect, we estimated rejection rates per publisher and asked if they: 1) change with growth in
articles, 2) correlate with journal size, 3) predict article turnaround times, 4) correlate with
journal impact, 5) depend on the publisher, or 6) predict a journal’s proportion of special issue
articles.
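As one example of these checks, a minimal R sketch of the journal-size association (not our exact tests; age_years, rejection_rate, and articles_2022 are assumed column names of the assembled journal-level dataset) uses a rank correlation among young journals:

    # Minimal sketch (not the authors' analysis code): association between 2022
    # rejection rate and journal size among journals <= 10 years old.
    size_association <- function(journals) {
      young <- journals[journals$age_years <= 10, ]
      cor.test(young$rejection_rate, young$articles_2022, method = "spearman")
    }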
We found no clear trend between the evolution of rejection rates and publisher growth (Fig.
4A). Focussing on younger journals (≤10 years, ensuring fair comparisons) we found no
relationship between journal size and reported or calculated 2022 rejection rates (Fig. 4B).
Turnaround times are also not a strong predictor of rejection rates (Fig. 4supp1A). Finally,
citations per document (similar to Clarivate IF) did not correlate with rejection rates (Fig.
4supp1B), indicating citations are not a strong predictor. Ultimately, the factor that best
predicted rejection rates was the publisher itself: although both Frontiers and MDPI have
similar growth in special issue articles (Fig. 2), they show opposite trends in rejection rates
over time, and MDPI uniquely showed decreasing rates compared to other publishers (Fig. 4A). Raw
rejection rates for MDPI in 2022 were also lower than other publishers. Curiously, Hindawi and
MDPI journals with more special issue articles also had lower rejection rates (P = 5.5e-8 and
P = .01 respectively, Fig. 4supp2), which we could not assess for other publishers.
In summary we found no general associations across publishers between rejection rate and
most other metrics we investigated. Over time or among journals of similar age, rejection rate
patterns were largely publisher-specific. We did, however, recover a trend that within
publishers, rejection rates decline with increased use of special issue publishing.
Figure 4: Rejection rates are defined most specifically by publisher. A) Rejection rates differ markedly across publishers, including trends of increase, decrease, or no change. We estimated publisher rejection rates from varying available data, so we normalised these data by setting the first year on record as “100.” Within each publisher, we assume underlying data and definitions of ‘rejection’ are consistent from year to year, allowing comparisons among trends themselves. Frontiers data are the aggregate of all Frontiers journals, preventing the plotting of 95% confidence intervals. B) 2022 rejection rates among young journals (<10 years old) differ by publisher, but not by journal size (B) or other metrics (Fig. 4supp1).
Disproportionately inflated Impact Factor affects select publishers
Among the most important metrics of researcher impact and publisher reputation are citations.
For journals, the Clarivate 2-year IF reflects the mean citations per article in the two preceding
years. Here we found that IF has increased across publishers in recent years (Fig. 5supp1,2).
Explaining part of this IF inflation, we observed an exponential increase in total references per
document between 2018-2021 (Fig. 5supp3, and see (30)). However, we previously noted that
IF is used as a “badge of quality” by both researchers and publishers to earn prestige, and
that IF can be abused by patterns of self-citation. We therefore asked if changes in journal
citation behaviour may have contributed to recent inflation of the IF metric.
To enable systematic analysis, we used Cites/Doc from the Scimago database as a proxy of
Clarivate IF (Cites/Doc vs. IF: R2 = 0.77, Fig. 5supp4A). We then compared Cites/Doc to the
network-based metric “Scimago Journal Rank” (SJR). Precise details of these metrics are
discussed in the supplementary methods (Supplementary Table 1 and see (18)). A key
difference between SJR and Cites/Doc is that SJR has a maximum amount of ‘prestige’ that
can be earned from a single source. As such, within-journal self-citations or citation cartel-like
behaviour is rewarded in Cites/Doc and IF, but not SJR. We define the ratio of Cites/Doc to
SJR (or IF to SJR) as “impact inflation.”
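A minimal R sketch of how impact inflation can be computed and summarised by publisher is shown below; it is not our exact code, and the simplified column names (cites_per_doc_2y, sjr, publisher) stand in for the corresponding fields of the Scimago export.

    # Minimal sketch (not the authors' code): impact inflation as the ratio of
    # Cites/Doc (2-year) to SJR for each journal, averaged by publisher.
    impact_inflation_by_publisher <- function(scimago) {
      scimago$impact_inflation <- scimago$cites_per_doc_2y / scimago$sjr
      aggregate(impact_inflation ~ publisher, data = scimago, FUN = mean)
    }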
Impact inflation differs dramatically across publishers (Fig. 5A), and has also increased across
publishers over the last few years (Fig. 5supp5A). In 2022, impact inflation in MDPI and
Hindawi was significantly higher than in all other publishers (Padj < .05). Interestingly, Frontiers
had low impact inflation comparable to other publishers, despite growth patterns similar to
MDPI and Hindawi.
The reason behind MDPI’s anomalous impact inflation appears to be straightforward: MDPI
journals nearly universally spiked in rates of within-journal self-citation during the study period
Figure 5: Changing behaviour of citation metrics revealed by Impact Inflation. Statistical letter
groups reflect differences in one-way ANOVA with Tukey HSD. A) MDPI and Hindawi have significantly
higher impact inflation compared to all other publishers. Comparisons using samples of Clarivate IFs
are shown in Fig. 5supp4. B) MDPI journals have the highest rate of within-journal self-citation among
compared publishers, including in previous years (Fig. 5supp5,6). Here we specifically analyse journals
receiving at least 1000 citations per year to avoid comparing young or niche journals to larger ones
expected to have diverse citation profiles.
(Fig. 5supp5B), with significant differences in self-citation rates compared to other publishers
(Fig. 5B, Padj < .05, and MDPI vs. Taylor & Francis, Padj = .13), including comparisons in previous years (Fig. 5supp6, Padj 2021 < .05, and MDPI vs. Taylor & Francis Padj 2021 = 3e-7). Indeed,
beyond within-journal self-citations, in an analysis from 2021, MDPI journals received ~29%
of their citations from other MDPI journals (31), which would be rewarded per citation for IF
but not SJR. Notably, Hindawi had self-citation rates more comparable to other publishers
(Fig. 5B, Fig. 5supp6), despite high impact inflation. In this regard, while Hindawi journals may
not directly cite themselves as often, they may receive many citations from a small network of
journals, including many citations from MDPI journals (example in Fig. 5supp7).
In summary, we provide a novel metric, “impact inflation,” that uses publicly-available data to
assess journal citation behaviours. Impact inflation describes how proportionate a journal’s
total citations are compared to a network-adjusted approach. In the case of MDPI, there was
also a high prevalence of within-journal self-citation, consistent with reports by Oviedo-García (32) and MDPI itself (31). However, high impact inflation and self-citation are not strictly correlated with the other metrics we investigated.
Discussion
Here we have characterised the strain on scientific publishing, as measured by the exponential
rise of indexed articles and the resulting inability of scientists to keep up with them. The
collective addition of nearly one million articles per year over the last 6 years alone costs the
research community immensely, both in writing and reviewing time and in fees and article
processing charges. Further, given our strict focus on indexed articles, not total articles, our
data likely underestimate the true extent of the strain: the problem is even worse than we describe.
The strain we characterise is a complicated problem, generated by the interplay of different
actors in the publishing market. Funders want to get the best return on their investment, while
researchers want to prove they are a good investment. The rise in scientific article output is
only possible with the participation of researchers who act as authors, reviewers and editors.
Researchers do this because of the “publish or perish” imperative (19), which rewards
individual researchers who publish as much as possible, forsaking quality for quantity. On the
other hand, publishers host and spur the system’s growth in their drive to run a successful
business. Publishers structure the market, control journal reputation, and as such are focal players, which has led to concerns about the extent to which publisher behaviour is motivated by profit (28). Growth in published papers should be possible and could be welcome. However,
in the business of science publishing, growth should never come at the cost of the scientific
process.
Considering our metrics in combination (Table 2) also allows us to identify common trends
and helps to characterise the role that different publishers play in generating the strain. Across
publishers, article growth is the norm, with some groups contributing more than others. Impact
factors and impact inflation have both increased universally, exposing the extent to which the
publishing system itself has succumbed to Goodhart’s law. Nonetheless, the vast majority of
growth in total indexed articles has come from just a few publishing houses following two broad
models.
Table 2: Strain indicators from 2016 to 2022. Data on total articles and impact inflation are drawn from the Scimago dataset. Data on special issues, turnaround times, and rejection rates come from web scrapes limited to the publishers shown. Rejection rate changes for Elsevier and Hindawi start from 2018 and 2020 respectively. pp = ‘percentage points.’
For older publishing houses (e.g. Elsevier, Springer), growth was not driven by major increases across all journals, but by mild growth in both total journals and articles per journal acting in tandem. Another strategy, used only by certain for-profit gold open access publishers, consisted of an increased use of special issue articles as a primary means of publishing. This trend was coupled with uniquely reduced turnaround times and, in specific cases, high impact inflation and reduced rejection rates. Despite their stark differences, the amount of strain generated through these two strategies is comparable.
The rich context provided by our metrics also yields unique insights. Ours is the first study,
of which we are aware, to document that special issue articles are systematically handled
differently from normal submissions: special issues have lower rejection rates, and also both
lower and seemingly more homogeneous turnaround times. We also highlight the unique view
one gets by considering different forms of citation metrics, and develop impact inflation
(IF/SJR) as a litmus test for journal reputation, informing not on journal impact itself, but rather
whether a journal’s impact is proportional to its expected rank absent the contribution of e.g.
citation cartels.
Throughout our study, MDPI was an outlier in every metric, often by wide margins. MDPI had
the largest growth of indexed articles (+1080%) and proportion of special issue articles (88%),
shortest turnaround times (37 days), decreasing rejection rates (-8 percentage points), highest
impact inflation (5.4), and the highest within-journal mean self-citation rate (9.5%). Ours is not
the first study analysing MDPI (13, 32, 33), but our broader context highlights the uniqueness
of their profile and of their contribution to the strain.
Some metrics appear to be principally driven by publishers' policies: rejection rates and turnaround time means and variances are largely independent from any other metric we assayed. This raises questions about the balance between publisher oversight and scientific
editorial independence. This balance is essential to maintain scientific integrity and authority:
oversight should be sufficient to ensure rigorous standards, but not so invasive as to override
the independence of editors. Understanding how editorial independence is maintained in
current publishing environments, though beyond the scope of this paper, is key to maintaining
scientific integrity and authority.
Given the importance of scientific publishing, it is unfortunate that the basic data needed to
inform an evidence-based discussion are so hard to collect. This discussion on academic
publishing would be easier if the metrics we collected were more readily available; we had to web scrape to obtain many pieces of basic information. The availability of our metrics could
be encouraged by groups such as the Committee on Publication Ethics (34), which publishes
guides on principles of transparency. We would recommend transparency for: proportion of
articles published through special issues (or other collection headings), article turnaround
times, and rejection rates. Rejection rates in particular would benefit from an authority
providing a standardised reporting protocol, which would greatly boost the ability to draw
meaningful information from them. While not a metric we analysed, it also seems prudent for
publishers to be transparent about revenue and operating costs, given much of the funding
that supports the science publishing system comes from taxpayer-funded or non-profit entities.
Referees such as Clarivate should also be more transparent; their decisions can have a
significant impact on the quality of publisher badges (see Table 1supp1 and (35)), and yet the
reasoning behind these decisions is opaque.
Greater transparency will allow us to document the strain on scientific publishing more
effectively. However, it will not answer the fundamental question: how should this strain be
addressed? Addressing strain could take the form of grassroots efforts (e.g. researcher
boycotts) or authority actions (e.g. funder or committee directives, index delistings).
Researchers, though, are a disparate group and collective action is hard across multiple
disciplines, countries and institutions. In this regard, funders can change the publish or perish
dynamics for researchers, thus limiting their drive to supply articles. We recommend that funders review the metrics we define here and adopt policies such as narrative CVs that highlight
researchers’ best work over total volume (36), which mitigate publish or perish pressures.
Indeed, researchers agree that changes to research culture must be principally driven by
funders (37), whose financial power could also help promote engagement with commendable
publishing practices.
Our study shows that regulating behaviours cannot be done at the level of publishing model.
Gold open access, for example, does not necessarily add to strain, as gold open access
publishers like PLOS (not-for-profit) and BMC (for-profit) show relatively normal metrics across
the board. Rather, our findings suggest that addressing strain requires action targeted at specific publishers and specific behaviours. For instance, collective action by the researcher community, or guidelines from funders or ethics committees, could encourage fewer articles to be published through special issues, which our study suggests are not held to
the same standard as normal issues. Indeed, reducing special issue articles would already
address the plurality of strain being added.
Here we have characterised the strain on scientific publishing. We hope this analysis helps
advance the conversation among publishers, researchers, and funders to reduce this strain
and work towards a sustainable publishing infrastructure.
Acknowledgements
We thank the following publishers for providing data openly, or upon request: MDPI, Hindawi,
Frontiers, PLOS, Taylor & Francis, BMC and The Royal Society. We further thank many
colleagues and publishers for providing feedback on this manuscript prior to its public release:
Matthias Egger, Howard Browman, Kent Anderson, Erik Postma, Yuko Ulrich, Paul Kersey,
Gemma Derrick, Odile Hologne, Pierre Dupraz, Navin Ramankutty, and representatives from
the publishers MDPI, Frontiers, PLOS, Springer, Wiley, and Taylor & Francis. This work was
a labour of love, and was not externally funded.
Author contributions
Web scraping was performed by PGB and PC, and Scimago data curation by MAH. Global
doctorate and global researcher data curation was done by MAH and DB. Data analysis in R
was done by MAH, PGB, and PC. Conceptualisation was performed collectively by MAH,
PGB, PC, and DB. The initial article draft was written by MAH. All authors contributed to writing
and revising to produce the final manuscript.
References
1. C. W. Fox, A. Y. K. Albert, T. H. Vines, Recruitment of reviewers is becoming harder at some journals: a test of the influence of reviewer fatigue at six journals in ecology and evolution. Res Integr Peer Rev 2, 3 (2017).
2. C. J. Peterson, C. Orticio, K. Nugent, The challenge of recruiting peer reviewers from one medical journal’s perspective. Proc (Bayl Univ Med Cent) 35, 394-396 (2022).
3. A. Severin, J. Chataway, Overburdening of peer reviewers: A multistakeholder perspective on causes and effects. Learned Publishing 34, 537-546 (2021).
4. P. D. B. Parolo, et al., Attention decay in science. Journal of Informetrics 9, 734-745 (2015).
5. D. Maher, A. Aseffa, S. Kay, M. Tufet Bayona, External funding to strengthen capacity
for research in low-income and middle-income countries: exigence, excellence and
equity. BMJ Glob Health 5, e002212 (2020).
6. G. Nakamura, B. E. Soares, V. D. Pillar, J. A. F. Diniz-Filho, L. Duarte, Three pathways
to better recognize the expertise of Global South researchers. npj biodivers 2, 17
(2023).
7. U.S. scientific leaders need to address structural racism, report urges (2023) https://doi.org/10.1126/science.adh1702 (May 6, 2023).
8. S.-N. C. Liu, S. E. V. Brown, I. E. Sabat, Patching the “leaky pipeline”: Interventions for women of color faculty in STEM academia. Archives of Scientific Psychology 7, 32-39 (2019).
9. E. Meijaard, M. Cardillo, E. M. Meijaard, H. P. Possingham, Geographic bias in citation rates of conservation research: Geographic Bias in Citation Rates. Conservation Biology 29, 920-925 (2015).
10. A. Mlinarić, M. Horvat, V. Šupak Smolčić, Dealing with the positive publication bias:
Why you should really publish your negative results. Biochem Med (Zagreb) 27,
030201 (2017).
11. L. J. Hofseth, Getting rigorous with scientific rigor. Carcinogenesis 39, 21-25 (2018).
12. D. Sarewitz, The pressure to publish pushes down quality. Nature 533, 147 (2016).
13. A. Abalkina, Publication and collaboration anomalies in academic papers originating from a paper mill: Evidence from a Russia-based paper mill. Learned Publishing, leap.1574 (2023).
14. C. Candal-Pedreira, et al., Retracted papers originating from paper mills: cross-sectional study. BMJ, e071517 (2022).
15. H. Else, R. Van Noorden, The fight against fake-paper factories that churn out sham science. Nature 591, 516-519 (2021).
16. A. Grudniewicz, et al., Predatory journals: no definition, no defence. Nature 576, 210-212 (2019).
17. E. Garfield, The History and Meaning of the Journal Impact Factor. JAMA 295, 90
(2006).
18. V. P. Guerrero-Bote, F. Moya-Anegón, A further step forward in measuring journals’ scientific prestige: The SJR2 indicator. Journal of Informetrics 6, 674-688 (2012).
19. D. R. Grimes, C. T. Bauch, J. P. A. Ioannidis, Modelling science trustworthiness under
publish or perish pressure. R. Soc. open sci. 5, 171511 (2018).
20. F. C. Fang, A. Casadevall, Research Funding: the Case for a Modified Lottery. mBio 7,
e00422-16 (2016).
21. D. Li, L. Agha, Big names or big ideas: Do peer-review panels select the best science proposals? Science 348, 434-438 (2015).
22. W. Quan, B. Chen, F. Shu, Publish or impoverish: An investigation of the monetary reward system of science in China (1999-2016). AJIM 69, 486-502 (2017).
23. M. Thelwall, et al., In which fields do higher impact journals publish higher quality articles? Scientometrics 128, 3915-3933 (2023).
24. M. Fire, C. Guestrin, Over-optimization of academic publishing metrics: observing
Goodhart’s Law in action. GigaScience 8, giz053 (2019).
25. I. Fister, I. Fister, M. Perc, Toward the Discovery of Citation Cartels in Citation
Networks. Front. Phys. 4 (2016).
26. E. A. Fong, R. Patnayakuni, A. W. Wilhite, Accommodating coercion: Authors, editors,
and citations. Research Policy 52, 104754 (2023).
27. Scimago, SCImago Journal & Country Rank [Portal] (2023).
28. J. P. A. Ioannidis, A. M. Pezzullo, S. Boccia, The Rapid Growth of Mega-Journals:
Threats and Opportunities. JAMA 329, 1253 (2023).
29. D. W. Grainger, Peer review as professional responsibility: A quality control system only as good as the participants. Biomaterials 28, 5199-5203 (2007).
30. B. D. Neff, J. D. Olden, Not So Fast: Inflation in Impact Factors Contributes to Apparent Improvements in Journal Quality. BioScience 60, 455-459 (2010).
31. MDPI, “Comment on: ‘Journal citation reports and the definition of a predatory journal:
The case of the Multidisciplinary Digital Publishing Institute (MDPI)’ from Oviedo-
García” (2021) (May 6, 2023).
32. M. Á. Oviedo-García, Journal citation reports and the definition of a predatory journal: The case of the Multidisciplinary Digital Publishing Institute (MDPI). Research Evaluation 30, 405-419 (2021).
33. S. Copiello, On the skewness of journal self-citations and publisher self-citations: Cues for discussion from a case study. Learned Publishing 32, 249-258 (2019).
34. E. Wager, The Committee on Publication Ethics (COPE): Objectives and achievements 1997-2012. La Presse Médicale 41, 861-866 (2012).
35. MDPI, “Clarivate Discontinues IJERPH and JRFM Coverage in Web of Science” (2023)
(May 21, 2023).
36. DORA, Changing the narrative: considering common principles for the use of narrative
CVs in grant evaluation (2022) (July 31, 2023).
37. Wellcome Trust, “What Researchers Think About the Culture They Work In” (2020).
Supplementary Materials and Methods
Data collection
Global researcher statistics
Total PhD graduate numbers were obtained from the Organisation for Economic Co-operation
and Development (OECD, https://stats.oecd.org) and filtered to remove graduates of “Arts and
Humanities” to better focus on growth of graduates in Science, Technology, Engineering, and
Mathematics (STEM). Other sources were consulted to complement OECD data with data for
China and India (NSF, 2022; Zwetsloot et al., 2021). This choice was made to improve
robustness by ensuring the inclusion of these two major populations did not affect data trends.
However, these sources used independent parameters for assessment, and lacked data past
2019, and so while we could use estimates for China and India for 2020, we ultimately chose
to show only the OECD PhD data in Fig. 1A. Figure 1supp1 considers the addition of these
external data and includes projections to 2022 using quadratic regression given the plateauing
trend.
We also compared total articles with the United Nations Educational, Scientific and Cultural
Organization (UNESCO) data on researchers-per-million (full time equivalent) from the Feb
2023 release of the UNESCO Science, Technology, and Innovation dataset
(http://data.uis.unesco.org, 9.5.2 Researchers per million inhabitants). Figure 1supp1
considers these data, including projections to 2022, using a linear regression model given
trends.
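A minimal R sketch of this projection approach is given here; it is not our exact code, and phd and researchers are hypothetical data frames with year and count columns standing in for the series described above.

    # Minimal sketch (not the authors' code): project a series to 2022 with a
    # polynomial fit; degree 2 (quadratic) for the plateauing PhD series and
    # degree 1 (linear) for the UNESCO researcher series. Assumes the observed
    # series ends before 2022.
    project_to_2022 <- function(d, degree) {
      fit <- lm(count ~ poly(year, degree), data = d)
      years_out <- (max(d$year) + 1):2022
      data.frame(year = years_out,
                 projected = predict(fit, newdata = data.frame(year = years_out)))
    }
    # project_to_2022(phd, degree = 2)          # quadratic projection
    # project_to_2022(researchers, degree = 1)  # linear projection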
Only ~0.1% of journals in the overlap of the Scimago and Web of Science databases had their
sole category listed as “Arts and Humanities,” and so we ran analyses with or without those
journals, which gave the same result. Strictly Arts and Humanities journals are not retained in
our final datasets being analysed.
Publisher and journal-level data
Total articles published per year were obtained from Scimago (Scimago, 2023). Historical data
(1999 to 2022) for total number of articles, total citations per document over 2 years, the
Scimago Journal Rank (SJR) metric, and total references per document were obtained from
Scopus via the Scimago web portal (https://www.scimagojr.com/journalrank.php). Scimago
yearly data were downloaded with the “only WoS journals” filter applied to ensure the journals
we include here were indexed by both Scopus and Web of Science (Clarivate). Within-journal
self-citation rate was obtained from Scimago via web scraping.
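A minimal R sketch of assembling these yearly downloads into one journal-year panel follows; it is not our exact pipeline, and the local file names, the semicolon separator, and the assumption that columns are consistent across yearly files are all illustrative.

    # Minimal sketch (not the authors' pipeline): stack yearly Scimago exports
    # (downloaded with the "only WoS journals" filter) into one journal-year panel.
    # File naming, separator, and column consistency are assumptions.
    years <- 1999:2022
    panel <- do.call(rbind, lapply(years, function(y) {
      d <- read.csv(sprintf("scimagojr_%d.csv", y), sep = ";", check.names = FALSE)
      d$year <- y
      d
    }))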
Historical Impact Factor data (2012-2022) for a range of publishers (16,174 journals across
BMC, Cambridge University Press, Elsevier, Emerald Publishing Ltd., Frontiers, Hindawi,
Lippincott, MDPI, Springer, Nature, Oxford University Press, PLOS, Sage, Taylor & Francis,
and Wiley-Blackwell) was downloaded from Clarivate. Due to the download limit of 600
journals per publisher, these IFs represent only a subset of all IFs available.
Rejection rates were collected from publishers in a variety of ways: 1) obtained from publishers' online reports (Frontiers: https://progressreport.frontiersin.org/peer-review), 2) given upon request (PLOS, Taylor & Francis), and 3) web scraping of publicly-available data extracted from journal or company websites (MDPI, Hindawi, Elsevier via https://journalinsights.elsevier.com/). Frontiers rejection rate data lack journal-level resolution, and are instead the aggregate from the whole publisher per year.
Article-level data
Several methods were used to obtain article submission and acceptance times (in order to
calculate turnaround times), along with whether articles were part of a special issue (also
called “Theme Issues”, “Collections” or “Topics” depending on the publisher). PLOS, Hindawi
and Wiley's turnaround times were extracted directly from their corpus. The latter was shared
with the authors by Wiley upon request, while PLOS’ (https://plos.org/text-and-data-mining/)
and Hindawi’s (https://www.hindawi.com/hindawi-xml-corpus/) are available online. BMC,
Frontiers, MDPI, Nature and Springer data were obtained via web scraping of individual
articles and collecting data in “article information”-type sections. Taylor & Francis turnaround
times were obtained via CrossRef (“CrossRef,” 2023) by filtering all available ISSNs from
Scimago. To obtain Elsevier turnaround times we first extracted all Elsevier related ISSNs
from Scimago, queried these in CrossRef to obtain a list of DOIs, and then web scraped the
data from those articles. We also collected information on whether Elsevier articles were part
of special issues during our web-scraping. However, the resulting data were unusually spotty
and incomplete: for instance, we had journals with only one article total with data on special
issue status, which would falsely suggest that “100%” of articles in that journal were special
issue articles. Ultimately we did not include Elsevier in our analysis of special issue articles.
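A minimal R sketch of querying the public CrossRef REST API for works belonging to a journal ISSN is shown below; it is not our scraper, the ISSN shown is hypothetical, and whether usable submission/acceptance information is present in the returned metadata varies by publisher.

    # Minimal sketch (not the authors' scraper): fetch one page of works for an
    # ISSN from the public CrossRef REST API and extract DOIs for later scraping
    # or inspection of any date information deposited by the publisher.
    library(jsonlite)

    crossref_works <- function(issn, rows = 100) {
      url <- sprintf("https://api.crossref.org/journals/%s/works?rows=%d", issn, rows)
      fromJSON(url, simplifyVector = FALSE)$message$items
    }

    # items <- crossref_works("0000-0000")           # hypothetical ISSN
    # dois  <- vapply(items, function(x) x$DOI, "")  # DOIs for further scraping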
Data analysis and rationale
Grouping of publishers per Scimago labels
Publisher labels in Scimago were aggregated according to key “brand” names such as
“Elsevier” or “Springer”; e.g. Elsevier BV, Elsevier Ltd and similar were aggregated as Elsevier,
or Springer GmbH & Co, Springer International Publishing AG as “Springer.”
This does not entirely capture the nested publishing structures of certain “publishers” per
Scimago labels. At the time of writing, Elsevier and Springer are both publishers who own
>2500 journals according to self-descriptions. However, our dataset only assigns ~1600
journals to these publishers in 2022 (Fig. 1supp2). Reasons for this discrepancy between self-
reported numbers and our aggregate numbers come from smaller, but independent, publisher
groups operating under the infrastructure of these larger publishing houses. Two examples:
1) Cell Press (> 50 journals) is owned by Elsevier, 2) both BioMed Central (BMC, >200
journals) and Nature Portfolio (Nature, >100 journals) are owned by Springer. Ultimately, we
decided that publishing houses large enough to distinguish themselves with their own licensed
names were managed and operated sufficiently independently from their parent corporations
to be kept separate. Nevertheless, our dataset aggregates the majority of Elsevier and
Springer journals under their namesakes, and so we feel the data we report are a
representative sampling, even if we caution that interpretation of trends in “Elsevier”, “Nature”,
or other publishers, should consider this caveat regarding nested publisher ownership.
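A minimal R sketch of this label aggregation follows; the rule list is illustrative and deliberately incomplete (for example, it does not handle the nested houses such as Cell Press, BMC, or Nature discussed above).

    # Minimal sketch (not the authors' code): collapse Scimago publisher labels
    # onto a "brand" name with simple pattern rules. Illustrative and incomplete.
    brand <- function(publisher) {
      out <- publisher
      out[grepl("Elsevier", publisher, ignore.case = TRUE)] <- "Elsevier"
      out[grepl("Springer", publisher, ignore.case = TRUE)] <- "Springer"
      out[grepl("Wiley",    publisher, ignore.case = TRUE)] <- "Wiley"
      out
    }
    brand(c("Elsevier BV", "Springer International Publishing AG", "Wiley-Blackwell"))
    # "Elsevier" "Springer" "Wiley"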
Our choice of publishers to highlight in comparisons required careful judgements. Our goal of
characterising strain meant that we had to focus on publishers that were sufficiently “large” as
far as our strain metrics were concerned. We included publishers like Hindawi and Public
Library of Science (PLOS) because they were uniquely ‘large’ in terms of certain business
models. PLOS is the largest publisher in terms of articles per journal per year (Fig. 1C), while
Hindawi is a major publisher in terms of publishing articles under the Special Issue model (Fig.
2) that is also of current public interest (Quaderi, 2023). We also retained BMC and Nature as
independent entities in our study, as these publishers offer relevant comparisons among
publishing models. BMC is a for-profit Open Access publisher that operates hundreds of
journals, much like Hindawi, Frontiers, and MDPI. Nature is a hybrid model publisher that
includes paywalled or Open Access articles, publishes more total articles than BMC (Fig. 1B),
and was a distinct publishing group for which we could collect systematic data on Special
Issue use and turnaround times (Fig. 2, Fig. 3). We were also able to collect a partial sampling
of those data from Springer, but to merge the two would have caused Nature to contribute a
strong plurality of trends in Springer data in Fig. 2 and Fig. 3, obscuring the trends of both this
nested publishing house and of the remaining majority of Springer journals: indeed the
proportion of Special Issues (Fig. 2), turnaround times (Fig. 3), Impact Inflation and self-citation
(Fig. 4B) of Nature is significantly different from other Springer journals, sometimes by a wide
margin.
In some cases, journal size was also a relevant factor for comparisons across publishers. As
emphasised by the high number of articles per journal by PLOS, MDPI, and Frontiers (Fig.
1C), some publishers publish hundreds to thousands of articles per journal annually, while
others publish far less. The age of journals was also tied to this article output, as newer
journals publish fewer articles, but grow to publish thousands of articles annually in later years.
We therefore considered journal size throughout (Fig. 1supp3), and in metrics like self-citation we restricted comparisons to journals receiving at least 1000 citations per year.
These filters were applied to ensure comparisons across journals and publishers were being
made fairly: for instance, small journals have fewer articles to self-cite to, and highly specific
niche journals may have high rates of self-citation for sensible reasons. This was especially
important for comparisons at the publisher level, as some publishers have increased their
number of journals substantially in recent years (Fig. 1supp2), meaning a large fraction of their
journals are relatively young and less characteristic of the publisher’s trends according to their
better-established journals.
Rejection rates
The analysis of rejection rates comes with important caveats: these data come from non-
standardised data sources (each publisher decides how rejection rates are reported) and we
use voluntarily-reported rather than systematic data (volunteer bias).
In most cases, publishers track the total submissions, rejections, and acceptances over a set
period of time. This can sometimes be just a few months, or it can be the length of a whole
year. We defined rejection rate as a function of accepted, rejected, and total submissions,
depending on the data that were available for each publisher. However, this definition fails to
account for the dynamic status of articles as they go through peer review. For instance,
publishers may define “rejection” as any article sent back to the authors, even if the result was
ultimately “accept.” These differences can drastically affect the absolute value of rejection
rates, as some publishers may count Schrödinger-esque submissions where the underlying
article is tallied as both “rejected” and “accepted” with different timestamps.
We will also note that while Frontiers and MDPI provided their rejection rates publicly, we were
forced to assemble their rejection rates manually. For Frontiers, we explicitly use 1 - (accepted articles / total submissions); however, Frontiers reports an independent number they call "Rejected" articles that gives a lower value if used in the formula rejected articles / total submissions (Frontiers data collected from https://progressreport.frontiersin.org/peer-review,
accessed Sept. 4th, 2023). Meanwhile, MDPI rejection rate data were available via “Journal
statistics” web page html code as “total articles accepted” and “total articles rejected.” We
therefore defined total submissions to MDPI as the sum of all accepted and rejected articles.
On the other hand, Hindawi reported their rejection rates publicly on journal pages as
“acceptance rate,” although the underlying calculation method is not given.
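A minimal R sketch of the two calculations described above:

    # Minimal sketch (not the authors' code) of the rejection rate definitions
    # used for Frontiers and MDPI data, as described above.
    rejection_rate_frontiers <- function(accepted, total_submissions) {
      1 - accepted / total_submissions               # 1 - (accepted / total submissions)
    }
    rejection_rate_mdpi <- function(accepted, rejected) {
      rejected / (accepted + rejected)               # total submissions = accepted + rejected
    }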
Finally, rejection rates are intrinsically tied to editorial workload capacity and total submissions
received. For some journals, total workload from submissions has trade-offs with what can be
feasibly edited. For instance, eLife initially saw longer times to decide whether or not to desk-reject an article during a trial that committed to publishing all articles after peer review at the author's discretion (eLife, 2019). This change to longer processing times was presumably instinctive editor behaviour to avoid the ensuing workload of accepting all articles for peer review, and the associated commitment of time. In other journals, relatively few articles per
editor might be submitted, and so more articles could be considered for publication and
retained for reassessment following revisions, perhaps visible as broader turnaround time
distribution curves (Fig. 3B). Thus, we will stress that the absolute rejection rate itself is not a
measure of quality or rigour, but rather reflects the balance between editorial capacity, journal
scope and mission, and the total submissions received.
While we could not standardise the methodology used to calculate rejection rates across
publishers, we make the assumption that publishers have at least maintained a consistent
methodology internally across years. For this reason, while comparing raw rejection rates
comes with many caveats, comparing the direction of change itself in rejection rates shown in
Fig. 4A should be relatively robust to differences between groups.
Impact inflation
“Impact inflation” is a new synthetic metric we define, and so here we will take care to detail
its characteristics, caveats, and assumptions in depth. Principally, this metric uses the ratio of
the Clarivate Impact Factor (IF) to the Scimago Journal Rank (SJR).
One of the most commonly used metrics for judging journal impact is provided by Clarivate’s
annual Journal Citation Reports: the journal IF. Impact Factor is calculated as the mean total
citations per article in articles recently published in a journal, most commonly the last two
years. The formula for IF is as follows, where y represents the year of interest (Garfield, 2006):
𝐼𝐹
𝑦=𝑇𝑜𝑡𝑎𝑙𝑐𝑖𝑡𝑎𝑡𝑖𝑜𝑛𝑠𝑦
𝑇𝑜𝑡𝑎𝑙𝑝𝑢𝑏𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛𝑠𝑦−1 + 𝑇𝑜𝑡𝑎𝑙𝑝𝑢𝑏𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛𝑠𝑦−2
The IF of a journal for 2022 is therefore:
𝐼𝐹2022 =𝑇𝑜𝑡𝑎𝑙𝑐𝑖𝑡𝑎𝑡𝑖𝑜𝑛𝑠2022
𝑇𝑜𝑡𝑎𝑙𝑝𝑢𝑏𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛𝑠2021 + 𝑇𝑜𝑡𝑎𝑙𝑝𝑢𝑏𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛𝑠2020
The name “Impact Factor” refers to this calculation when done by Clarivate using their Web of
Science database. However, the exact same calculation can be performed using other
databases, including the Scopus database used by Scimago; indeed, these metrics are highly
correlated (Fig5supp4A). Because of mass delistings by Clarivate in their 2023 Journal
Citation Reports that affected many journals (Quaderi, 2023), which were not delisted in
Scimago data, there is a decoupling of the Scimago Cites/Doc and Clarivate IF in 2022 data
(F = 6736 on 1, 3544 df, adj-R2 = 0.72) compared to previous years 2012-2021 (F = 43680 on
1, 13397 df, adj-R2 = 0.77). The overall trends in Impact Inflation are robust to use of either
Cites/Doc or IF in 2021 or 2022 (Fig. 5A, Fig. 5supp4). For continuity with discussion below,
we will describe IF as a metric of journal “prestige,” as IF is sometimes used as a proxy of
journal reputation.
The Scimago Journal Rank (SJR) is a metric provided by Scimago that is calculated differently
from Clarivate IF. The SJR metric is far more complex, and full details are better described in
(Guerrero-Bote and Moya-Anegón, 2012). Here we will provide a summary of the key
elements of the SJR that inform its relevance to Impact Inflation.
Supplementary Table 1: methodological differences between SJR and Impact Factor (adapted from Guerrero-Bote and Moya-Anegón, 2012)

                                  SJR                        Impact Factor
Database                          Scopus                     Web of Science
Citation time frame               3 years                    2 years
Self-citation contribution        Limited                    Unlimited
Field-weighted citation           Weighted                   Unweighted
Size normalisation                Citable document rate      Citable documents
Citation networks considered?     Yes                        No
The SJR is principally calculated using a citation network approach (visualised in Fig. 5supp7).
The reciprocal relationship of citations between journals is considered in the ultimate rank of
SJR “prestige,” including a higher value placed on citations between journals of the same
general field. The formula used by Scimago further limits the amount of prestige that one
journal can transfer to itself or to another journal. This is explicitly described as a way to avoid
“problems similar to link farms with journals with either few recent references, or too
specialized” (Guerrero-Bote and Moya-Anegón, 2012); “link farms” are akin to so-called
“citation cartels” described in (Abalkina, 2021; Fister et al., 2016). This is the most important
difference between IF and SJR: IF does not consider where citations come from, while SJR does. As a result, SJR does not permit journals with egregious levels of self- or co-citation to inflate the ultimate SJR prestige value.
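To illustrate the idea of limiting per-source contributions, we provide a deliberately simplified sketch in R. This is not the actual SJR algorithm, which is far more complex (see Guerrero-Bote and Moya-Anegón, 2012), but only a toy version of capping the share of a score that any single citing journal can supply.

```r
# Toy sketch only: cap the share of incoming citations that any single citing
# journal can contribute to a cited journal's score. NOT the actual SJR algorithm.
incoming <- c(MDPI_J1 = 600, MDPI_J2 = 300, Other_A = 50, Other_B = 50)

capped_share <- function(citations, cap = 0.25) {
  shares <- citations / sum(citations)  # share of incoming citations per source
  pmin(shares, cap)                     # no single source may contribute more than `cap`
}

sum(capped_share(incoming))    # 0.60: concentrated sources count for less after capping
sum(incoming / sum(incoming))  # 1.00: uncapped shares always sum to one
```

Under such a cap, a journal whose citations arrive from a broad pool of sources retains most of its score, while a journal fed by a handful of heavy citers does not.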
The ratio of IF/SJR can therefore reveal journals whose total citations (IF) come from
disproportionately few citing journals. MDPI journals have a much lower SJR compared to
their IF. The reason for this is exemplified by the ratios of citations in/out for the flagship MDPI journals International Journal of Molecular Sciences, International Journal of Environmental Research and Public Health, and Sustainability (see Fig. 5supp7). These three journals not only have high rates of within-journal self-citation (9.4%, 11.8%, and 15.3%, respectively, in 2022); they and other MDPI journals also contribute the plurality of total citations to each other (MDPI, 2021) and to other journals (e.g. Hindawi's BioMed Research International), citations which, outside of the MDPI network, are often not reciprocated (Fig. 5supp7).
Importantly, growth of articles per journal is not an intrinsic factor behind this disparity.
Frontiers has seen a similar level of growth of its articles per journal as MDPI (Fig. 1C),
enabled by using the special issues model (Fig. 2), but has far lower Impact Inflation scores
(Fig. 5A). Frontiers also receives more diverse citations coming from a wider pool of journals,
and only sparingly from other Frontiers journals (Fig. 5supp7). Importantly, we cannot
comment on why these behaviours exist. What can be said is that the MDPI model of
publishing seems to attract authors that cite within and across MDPI journals far more
frequently than authors publishing with comparable for-profit Open Access publishers like
Hindawi, Frontiers, or BioMed Central (BMC). Indeed, in a self-analysis published in 2021,
MDPI’s rates of within-publisher self-citation (~29%, ~500k articles) were highly elevated
compared to other publishers of similar size (not an opinion shared by MDPI). Their rates were also higher than those of IEEE (~5%, ~800k articles), Wiley-Blackwell (~17%, ~1.2m articles), and Springer Nature (~24%, ~2.5m articles), and lower only than Elsevier's (~37%, ~3.1m articles) (MDPI, 2021).
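As an illustration of how a within-publisher self-citation share of this kind can be computed, here is a minimal R sketch over a hypothetical journal-to-journal citation table; the publisher names and counts are invented for illustration and are not taken from any publisher dataset.

```r
library(dplyr)

# Hypothetical publisher-to-publisher citation counts (invented for illustration)
citations <- tibble::tribble(
  ~citing_publisher, ~cited_publisher, ~n_citations,
  "MDPI",            "MDPI",           290,
  "MDPI",            "Hindawi",         60,
  "MDPI",            "Elsevier",       150,
  "Elsevier",        "Elsevier",       370,
  "Elsevier",        "MDPI",            90
)

# Within-publisher self-citation share: citations where the citing and cited
# publishers match, as a fraction of all citations made by that publisher
citations %>%
  group_by(citing_publisher) %>%
  summarise(self_citation_share =
              sum(n_citations[cited_publisher == citing_publisher]) / sum(n_citations))
```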
Supplementary Figure 1: analysis of within-publisher self-citation rate performed by MDPI
(MDPI, 2021) in response to Oviedo-Garcia (Oviedo-García, 2021). The original interpretation
of this figure, as presented by MDPI, is: “It can be seen that MDPI is in-line with other
publishers, and that its self-citation index is lower than that of many others; on the other hand,
its self-citation index is higher than some others.” Our data in Fig. 5B (2022) and Fig. 5supp6
(2021) suggest instead that established MDPI journals receiving >1000 citations per year have
uniquely high rates of within journal self-citation, which are significantly different from other
publishers. This filter for journals receiving >1000 citations is key: because of the recent growth of MDPI journals, omitting it can give the false impression that MDPI, overall, has comparable rates of within-journal self-citation, since the many recently launched journals with relatively few articles cannot easily cite themselves (but can cite other MDPI journals).
The ratio of IF to SJR (or of the Scimago proxy Cites/Doc to SJR) therefore assesses how two
different citation-based metrics compare. The first metric (IF) is source-agnostic, counts the
raw volume of citations and documents, and outputs a prestige score. The second metric has
safeguards built in that prevent citation cartel-like behaviour from inflating a journal’s prestige,
and so if a journal receives a large number of its citations from just a few journals, it will not
receive an SJR score that is proportional to its IF.
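The calculation of the ratio itself is simple; below is a minimal R sketch with hypothetical journal-level values (the column names and numbers are ours; real values would come from Journal Citation Reports and Scimago exports).

```r
library(dplyr)

# Hypothetical journal-level metrics (invented for illustration)
journals <- tibble::tibble(
  journal       = c("Journal A", "Journal B", "Journal C"),
  impact_factor = c(6.2, 4.1, 3.0),  # Clarivate IF, or Scimago Cites/Doc (2 years) as a proxy
  sjr           = c(1.9, 0.6, 1.1)   # Scimago Journal Rank
)

# Impact inflation: source-agnostic citation rate (IF) divided by the
# network-adjusted prestige metric (SJR)
journals %>% mutate(impact_inflation = impact_factor / sjr)
```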
Comment on the advertisement of IF as a metric of prestige
It is striking to note that most journals celebrate a year-over-year increase in IF; however, our study shows that IF itself has become inflated, like a depreciating currency, by the huge growth in total articles and total citations within those articles (Fig. 5supp3, Fig. 5supp5). As a result, unless IF is considered as a relative rank, the value of a given IF changes over time. Indeed, a publisher whose journals averaged a Cites/Doc of "3.00" in 2017 ranked relatively high within our publisher comparisons; however, in 2022 a Cites/Doc of "3.00" is near the lowest average Cites/Doc across publishers (Fig. 5supp1). This rapid inflation, i.e. depreciation of IF-like metrics, does not affect the relative comparisons made in Clarivate Journal Citation Reports' IF rank or IF percentile. Publishers often report absolute IFs; however, our analysis suggests the more accurate IF-based metric to report would be a relative rank, such as IF rank or percentile within a given category.
As our impact inflation metric is similarly proportional to IF itself, we would likewise recommend
adaptations of impact inflation that compare relative ranks, such as quartiles. Unlike IF, the impact inflation metric is already partly normalised for journal size and field, because SJR is calculated using the citable document rate rather than the raw count of citable documents (Supplementary Table 1), making field-specific normalisation less important for comparisons of impact inflation across journals.
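A minimal R sketch of converting absolute values into the relative ranks we recommend, using hypothetical journals and categories (the column names and values are ours, for illustration only):

```r
library(dplyr)

# Hypothetical journal-level table (invented for illustration); a real table
# would come from Journal Citation Reports categories and Scimago exports
journals <- tibble::tibble(
  journal          = paste("Journal", 1:6),
  category         = rep(c("Ecology", "Oncology"), each = 3),
  impact_factor    = c(2.1, 3.4, 6.0, 3.0, 8.5, 12.0),
  impact_inflation = c(2.0, 2.8, 3.5, 2.2, 4.1, 5.3)
)

journals %>%
  group_by(category) %>%
  mutate(
    if_percentile      = percent_rank(impact_factor) * 100,  # relative rank within category
    inflation_quartile = ntile(impact_inflation, 4)          # quartile of impact inflation
  )
```

Reporting the percentile or quartile rather than the absolute value keeps comparisons meaningful even as the underlying metrics inflate year over year.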
Data and code availability
At this time (Sept. 2023), we have been legally advised to withhold sharing our code and data files. Our work was done in accordance with UK government policy on text mining for non-commercial research (Gov.uk, 2021), and we anticipate being able to share much of the code and data in the future. Code and data will be made available to peer reviewers to ensure a robust
peer review process.
We used R version 4.3.1 (R Core Team 2023) and the following R packages: agricolae v.
1.3.6 (de Mendiburu 2023), emmeans v. 1.8.7 (Lenth 2023), ggtext v. 0.1.2 (Wilke and Wiernik
2022), gridExtra v. 2.3 (Auguie 2017), gt v. 0.9.0 (Iannone et al. 2023), gtExtras v. 0.4.5 (Mock
2022), here v. 1.0.1 (Müller 2020), hrbrthemes v. 0.8.0 (Rudis 2020), kableExtra v. 1.3.4 (Zhu
2021), magick v. 2.7.5 (Ooms 2023), MASS v. 7.3.60 (Venables and Ripley 2002), multcomp
v. 1.4.25 (Hothorn, Bretz, and Westfall 2008), multcompView v. 0.1.9 (Graves, Piepho, and
Sundar Dorai-Raj 2023), MuMIn v. 1.47.5 (Bartoń 2023), mvtnorm v. 1.2.2 (Genz and Bretz
2009), patchwork v. 1.1.2 (Pedersen 2022), scales v. 1.2.1 (Wickham and Seidel 2022), sjPlot
v. 2.8.15 (Lüdecke 2023), survival v. 3.5.5 (Therneau 2023), TH.data v. 1.1.2 (Hothorn
2023), tidyverse v. 2.0.0 (Wickham et al. 2019) and waffle v. 0.7.0 (Rudis and Gandy 2017).
Package citations
Auguie, Baptiste. 2017. gridExtra: Miscellaneous Functions for “Grid” Graphics.
https://CRAN.R-project.org/package=gridExtra.
Bartoń, Kamil. 2023. MuMIn: Multi-Model Inference. https://CRAN.R-
project.org/package=MuMIn.
de Mendiburu, Felipe. 2023. agricolae: Statistical Procedures for Agricultural Research.
https://CRAN.R-project.org/package=agricolae.
Genz, Alan, and Frank Bretz. 2009. Computation of Multivariate Normal and t Probabilities.
Lecture Notes in Statistics. Heidelberg: Springer-Verlag.
Graves, Spencer, Hans-Peter Piepho, and Luciano Selzer with help from Sundar Dorai-Raj.
2023. multcompView: Visualizations of Paired Comparisons. https://CRAN.R-
project.org/package=multcompView.
Hothorn, Torsten. 2023. TH.data: TH’s Data Archive. https://CRAN.R-
project.org/package=TH.data.
Hothorn, Torsten, Frank Bretz, and Peter Westfall. 2008. “Simultaneous Inference in General
Parametric Models.” Biometrical Journal 50 (3): 346–63.
Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and
JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables. https://CRAN.R-
project.org/package=gt.
Lenth, Russell V. 2023. emmeans: Estimated Marginal Means, Aka Least-Squares Means.
https://CRAN.R-project.org/package=emmeans.
Lüdecke, Daniel. 2023. sjPlot: Data Visualization for Statistics in Social Science.
https://CRAN.R-project.org/package=sjPlot.
Mock, Thomas. 2022. gtExtras: Extending “gt” for Beautiful HTML Tables. https://CRAN.R-
project.org/package=gtExtras.
Müller, Kirill. 2020. here: A Simpler Way to Find Your Files. https://CRAN.R-
project.org/package=here.
Ooms, Jeroen. 2023. magick: Advanced Graphics and Image-Processing in R. https://CRAN.R-project.org/package=magick.
Pedersen, Thomas Lin. 2022. patchwork: The Composer of Plots. https://CRAN.R-
project.org/package=patchwork.
R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna,
Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Rudis, Bob. 2020. hrbrthemes: Additional Themes, Theme Components and Utilities for
“ggplot2”. https://CRAN.R-project.org/package=hrbrthemes.
Rudis, Bob, and Dave Gandy. 2017. waffle: Create Waffle Chart Visualizations in R. https://CRAN.R-project.org/package=waffle.
Therneau, Terry M. 2023. survival: A Package for Survival Analysis in R. https://CRAN.R-project.org/package=survival.
Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. Fourth edition. New York: Springer. https://www.stats.ox.ac.uk/pub/MASS4/.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino
McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.”
Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, and Dana Seidel. 2022. scales: Scale Functions for Visualization.
https://CRAN.R-project.org/package=scales.
Wilke, Claus O., and Brenton M. Wiernik. 2022. ggtext: Improved Text Rendering Support for
“ggplot2”. https://CRAN.R-project.org/package=ggtext.
Zhu, Hao. 2021. kableExtra: Construct Complex Table with “kable” and Pipe Syntax.
https://CRAN.R-project.org/package=kableExtra.
Supplementary gures and tables
Fig1supp1: the growing disparity between total articles per year and active researchers is robust to use of alternate datasets. Dotted lines indicate estimated trends. A) OECD data complemented with total STEM PhD graduates from India and China (dashed red line) does not alter the pattern of an overall decline in recent years (Fig. 1A). B) The ratio of total articles to total PhD graduates has gone up substantially since 2019. C) UNESCO data instead using total active researchers (full-time equivalent) per million people shows a similar trend. Of note, this proxy for active researchers may include non-publishing scientists (private industry, governmental) that are not participating in the strain on scientific publishing in the same way academic scientists are. D) Nevertheless, using UNESCO data the ratio of total articles to total active researchers has gone up substantially since 2019.
Fig1supp2: growth in total journals by publisher. Between 2013-2022, Elsevier, Springer, Taylor & Francis, MDPI, and Nature have added to their total journals noticeably. Note: we have only analysed journals indexed in both Scopus and Web of Science, and also collected journals under Publishers according to their licensed Publisher names. Subsidiary publishers under the umbrella of larger publishers are not included in larger publisher totals. For example, both BioMed Central (BMC) and Nature portfolio (Nature) are subsidiaries of Springer Nature (Springer), but host a large number of journals and license under a non-Springer name, and so are treated as separate entities in our study.
Fig1supp3: the rise of megajournals. We recover trends supporting the article by Ioannidis et al. (28), who described the "rise of megajournals." Specifically, we see a decline in the number of journals publishing <1 paper/week, but sharp increases in the number of journals publishing over a paper per day. Scientific publishing has therefore been concentrating more and more articles into fewer journals proportionally, which also coincides with a slight decline in the number of journals publishing only a few articles per year.
Fig2supp1: proportion of articles published in regular vs. special issues. Underlying data are the same as in Fig. 2. Line plots are shown to better depict the year-by-year evolving proportion of special issue articles to regular articles. The decline in Wiley articles from 2020-2022 is an artefact of web scraping where total data availability declined in these years. As shown in Fig. 1B, Wiley overall article output increased slightly in recent years.
Fig2supp2: change in special issue publishing between 2016 and 2022. Certain groups publish the majority of their articles through special issues. Mean proportions of articles published through regular or special issues are shown.
Fig3supp1: heterogeneity in journal mean turnaround times by publisher. Underlying data same as Fig. 3B. Here violin plots provide an alternate depiction of the density of turnaround time distributions of all articles within their publishing house. "Tay. & Fran." = Taylor & Francis.
Fig3supp2: article turnaround times split by normal or special issue status. Across publishers, special issue articles have lower turnaround times, often by significant margins (for each year: Tukey HSD, p < .05 = *). The only exception to this trend is Springer, which had higher turnaround times for special issue articles. Of note, the way that special issues are organised can vary across journals and publishers, which could explain the differences in the extent of these trends by publisher. Error bars represent standard error.
Fig4supp1: 2022 rejection rates by publisher, split across different parameters. Using a general linear model, we found no significant effect of total documents (Fig. 4B), citations per document (A), Scimago Journal Rank (B), or journal age (C) on a journal's 2022 rejection rates across publishers. We chose to investigate young journals (≤ 10 years) to avoid comparing long-established journals to new journals that might have different needs for growth.
Fig4supp2: rejection rates relative to proportions of special issue articles. For Hindawi and MDPI, two publishers that we could analyse, there was a significant correlation between 2022 journal rejection rates and their share of articles published through special issues.
Fig4supp3: the decline in MDPI rejection rates is present across journals of different size classes. A steady decline in rejection rates began between 2019-2020 (Fig. 4A) alongside growth in journal size (larger bubbles here).
Fig5supp1: raw Cites/Doc and SJR informing the Impact Inflation metric. A) Cites/Doc has been increasing year-over-year across publishers, with a notable uptick beginning after 2019. Here we describe the recent inflation of journal IF (with Cites/Doc as our proxy), suggesting the relative value of a given absolute IF number (e.g. "IF = 3") has decreased more rapidly than in years prior to 2019. B) The SJR has remained relatively constant in recent years, as expected since this metric is normalised for journal size and rate of citable documents generated, rather than raw total documents (see (18)). This suggests that the year-over-year increase in Impact Inflation (Fig. 5supp2) we've observed is due to increased total citations generated by increasing total articles, but those citations are not weighted as "prestigious" in a network-adjusted metric compared to pre-2019 years.
Fig5supp2: there has been a universal increase in Impact Inflation independent of journal size across all publishers. Also see Fig. 5supp5A.
Fig5supp3: a partial contributor to the increasing total citations being generated is an exponential increase in references per document between 2018-2021. As such, not only is more work being produced (total article growth), but that work is also proportionally generating more citations than articles would have in past years. Here we will note that this growth overlapped the COVID-19 pandemic, which provided an excess of potential writing time to scientists. However, growth in references per document began already between 2018 and 2019, suggesting the effect of COVID-19 cannot fully explain this change. The ensuing year of 2020 also coincides with an acceleration of articles published through special issues (Fig. 2supp1). Thus the growth in references per document is correlated both with a burst of special issue publishing and with world events. References per document also continued to increase between 2021 and 2022 despite measures around COVID-19 relaxing in 2022, albeit with a marked decrease in the rate of growth. A full understanding of the influence of COVID-19 on this growth in references per document, and of how much references per document explains the universal increase in impact factor (Fig. 5supp1,2), will await data from 2023, when the impact of COVID-19 is further lessened and the significant delistings that Clarivate performed in March 2023, which have had a marked effect on the calculation of impact factor (Fig. 5supp4A), can be accounted for.
Fig5supp4: validation of Scimago Cites/Doc (2 years) as a proxy of Clarivate journal IF. A) Prior to 2022, "Cites/Doc (2 years)" and Clarivate IF have a correlation of adj-R2 = 0.77 (A'), but due to mass delistings by Clarivate (but not Scimago) affecting 2022 journal IFs, there was a decoupling of this correlation for 2022 (A'': adj-R2 = 0.72). Regardless, Cites/Doc (2 years) informed by the Scopus database is a good proxy of Clarivate Web of Science IF. B-C) Impact Inflation calculated using a subset of Clarivate IFs we could download for our publishers of interest in 2021 (B) and 2022 (C). In both years, MDPI has significantly higher Impact Inflation compared to all other publishers except Hindawi. Here we leave an example in (B) of what is meant by "major outliers" in Fig. 5, to show that plotting the full x-axis range does not change the trends, but visually disguises them.
Fig5supp5: evolution of Impact Inflation and within-journal self-citation between 2016 and 2022. A) Impact Inflation has increased universally across publishers (absolute values summarised in Table 2). B) Within-journal self-citation has increased in recent years specifically for publishers that grew through use of the special issue model of publishing: MDPI, Frontiers, and Hindawi. Notably, MDPI has higher self-citation rates than any other publisher, exceeding previous highs from 2016 (Elsevier, Taylor & Francis) by over one percentage point.
Fig5supp6: within-journal self-citation rates from 2021, supporting the trend in 2022 that MDPI uniquely has significantly higher self-citation rates compared to all other publishers. A difference between 2021 and 2022 is that in 2022, MDPI and Taylor & Francis were not significantly different (P > .05). In 2021, this difference was significant (P = 3e-7).
Fig5supp7: example citation networks of single journals from Scimago. Journals were selected from the largest journals by publisher. Journal citation reciprocity is depicted with grey arrows for incoming citations and green arrows for outgoing citations. MDPI journals make up large fractions of the total incoming citations of their own journals, which is uniquely true of MDPI and not of other publishers in our analysis. This result is in keeping with MDPI's own analysis, which reported a ~29% within-MDPI citation rate (shown in supplementary materials and methods). High rates of Impact Inflation for Hindawi journals may come from disproportionate citations received from MDPI journals. For instance, a plurality of citations to BioMed Research International (row 2, column 1) come as large chunks (thick grey arrows) from MDPI journals (International Journal of Molecular Sciences, International Journal of Environmental Research and Public Health, Nutrients, Antioxidants, Cancers, etc.). A similar pattern is seen for Mathematical Problems in Engineering (row 2, column 2): Sustainability, Mathematics, Applied Sciences (Switzerland), Symmetry, Sensors, etc. Because the Scimago Journal Rank metric has an upper limit on the prestige a single source can provide, the large number of citations individual MDPI journals are exporting may be an important factor leading to universal trends in Impact Inflation. A full-resolution version of this figure is available online at doi: 10.6084/m9.figshare.24203790.
Table 1supp1: change in submitted papers relative to the previous month for the 25 largest MDPI journals. On March 23rd, 2023, Clarivate announced the delisting of the MDPI flagship journal International Journal of Environmental Research and Public Health (IJERPH), as well as Journal of Risk and Financial Management (JRFM). Following this, submissions to IJERPH plummeted by 73 percentage points in April 2023 compared to March 2023, which already showed an overall slowdown compared to February 2023. Moreover, submissions to MDPI journals in general were down across the board in April 2023 compared to March. A similar pattern was seen in early 2022 following the Chinese Academy of Sciences' release of their "Early Warning Journal List" trial, published Dec. 31st, 2021, which featured multiple MDPI journals. These patterns demonstrate that external authorities, such as Clarivate or national academies of science, can have profound impacts on author submission behaviour, despite the opaque methodologies surrounding their decisions to list or delist journals.