Being cited is usually good news,
even if the reference is critical. The
number of citations reflects the
scientific impact of an article and
often provides an initial assessment of the
quality of the publication cited, its author,
or the publishing journal. Thus, it is not
surprising that journal rankings and, thus,
academic success are increasingly based
on citation counts. However, various article
characteristics that do not necessarily relate
to the quality of an article can influence
citation counts.
For example, you are currently reading
an article by two researchers with the
same name who, when we first met,
worked at the same university, albeit
in different departments (statistics and
microeconomics). A famous similar
example is the collaboration of George
Box and Sir David Cox,1 who are said to
have worked together on their “analysis of
transformations” because of the similarity
of their names. Does the unusual nature of
the authorship mean you are more likely to
remember this article? And, if you are, does
that also mean you are more prone to cite it
in your work?
Such arbitrary influences on citation counts
are important to identify, because they can
distort the relationship between scientific
impact measures and research quality.
Moreover, these arbitrary factors may depend
on the readers’ perspective – for example,
perception of names may differ according
to readers’ own cultural backgrounds. So a
reasonable question is, how much variability
in citation frequencies can be explained by
other easily identifiable article differentials?
To critically examine this universal evaluation
practice, in which quality assessments are
largely based on citation counts only, in a
separate article we analyse potential citation
count dependencies for various examples of
article and authorship characteristics for the
top publications in economics, psychology
and statistics.2
Our empirical analysis was based on all
articles published in the 115 top journals
in economics, psychology and statistics as
determined by the SCImago journal ranking.
What’s in a name?
Does an author’s name affect their chances of being cited? Here, Philipp Otto and Philipp Otto – yes, two
researchers with the same name – investigate the impact of academic authorship characteristics on article citations
SIGNIFICANCE
34 February 2023 © 2023 The Royal Statistical Society
Downloaded from https://academic.oup.com/jrssig/article/20/1/34/7034190 by guest on 13 February 2023
Philipp Otto investigates new forms of work
at the Department of Work and Organizational
Psychology at the University of Bamberg and is a
lecturer at the Medical School Brandenburg as
well as the European University Viadrina.
Philipp Otto is junior professor
of big geospatial data at
the Institute of Cartography
and Geoinformatics, Leibniz
University Hannover, Germany.
For each publication published in these
journals between 1990 and 2016, various
article authorship characteristics and the
total number of citations until November
2017 were retrieved from Microsoft Academic
Search. All analyses reported here are based
on this data set, unless other sources are
explicitly indicated.
To account for the excess of zero citations,
a zero-inflated negative binomial model was
used as it provided a comparatively better fit
than comparable models (e.g., a zero-inflated
Poisson model). Thus, we could distinguish
between two different effects – determinants
causing an excess of zero citations and
influences associated with an increase/
decrease in the total number of citations.
Regression coefficients were estimated from
the model using the maximum likelihood
approach for a generalised linear model,
which is computationally implemented by
Zeileis et al.3 In general, overdispersion as
implied by the negative binomial model is a
commonly observed phenomenon of citation
counts or numbers of publications.
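To make the two parts of the model concrete, the following is a minimal sketch of a zero-inflated negative binomial probability function. The parameter names (`pi_zero`, `mu`, `size`) and the example values in the usage note are illustrative assumptions, not estimates from the study, and the analysis itself relied on the R implementation cited above rather than on this code.

```python
import math


def nb_pmf(k: int, mu: float, size: float) -> float:
    """Negative binomial probability of k citations.

    Parameterised by the mean `mu` and the dispersion `size`, so that the
    variance is mu + mu**2 / size (overdispersed relative to a Poisson).
    """
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k: int, pi_zero: float, mu: float, size: float) -> float:
    """Zero-inflated negative binomial probability of k citations.

    With probability `pi_zero` an article is an 'excess zero' (never cited);
    otherwise its citation count follows the negative binomial part.
    """
    count_part = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + count_part if k == 0 else count_part


def zinb_mean(pi_zero: float, mu: float) -> float:
    """Expected number of citations under the zero-inflated model."""
    return (1.0 - pi_zero) * mu
```

For example, with the hypothetical values `pi_zero = 0.2`, `mu = 20.0` and `size = 0.8`, `zinb_pmf(0, 0.2, 20.0, 0.8)` is larger than `nb_pmf(0, 20.0, 0.8)`, which is the excess of zero citations that the first model part is meant to capture; the second part governs how many citations a cited article attracts.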
Does alphabetical ordering
matter?
The order in which authors are listed in
joint publications is a prominent example
of authorship-related extrinsic article
characteristics affecting citation counts.
Some journals prefer purely alphabetical
order, while others list authors in descending
order of the amount of work contributed (i.e.,
the first author would be the main author of
the article).
Interestingly, a positive correlation
between surname initials and the scientific
success of the author has been reported by
Einav and Yariv4 for US researchers at top
institutions in economics, and by van Praag
and van Praag5 analysing the total number of
publications, although the initial letter of an
author’s surname in the alphabet is in no way
related to the quality of the research of that
person – at least as far as we know. However, it
can be assumed that even when authors are
listed alphabetically, the first author is most
prominently attached to the article.
Still, the probability of being the first
author by chance depends on the overall
frequency of names for each initial. Therefore
we assess the percentage of papers with
alphabetically sorted authors in Figure 1,
which shows the percentage of alphabetical
orders for all papers with two authors (only
considering the first letters of the surnames).
Research disciplines clearly vary with regard
to their author listing practices. Here, we
selected the journals based on the SCImago
ranking, which follows the Scopus definition
of the disciplines.
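The chance baseline can be sketched as follows. This is one reading of the comparison, not the authors' code: it assumes that the two initials of a paper are drawn independently from the first-letter distribution of the sample, and that pairs with identical initials count as alphabetically ordered (the `ties_count` switch is a hypothetical parameter added to make that choice explicit). All function names are illustrative.

```python
from collections import Counter


def initial_distribution(surnames):
    """Relative frequency of each first letter in a sample of surnames."""
    counts = Counter(name[0].upper() for name in surnames if name)
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}


def baseline_by_first_initial(freq, ties_count=True):
    """Chance that a random second initial does not precede the first one.

    Returns, for each possible first-author initial, the probability that a
    randomly paired co-author leaves the pair in alphabetical order.
    """
    baseline = {}
    for first in sorted(freq):
        later = sum(p for letter, p in freq.items() if letter > first)
        same = freq[first] if ties_count else 0.0
        baseline[first] = later + same
    return baseline


def observed_share(pairs):
    """Share of two-author papers whose initials are in alphabetical order."""
    ordered = sum(1 for a, b in pairs if a[0].upper() <= b[0].upper())
    return ordered / len(pairs)
```

Under these assumptions the baseline necessarily falls from early to late letters: an author whose surname starts with A is first in almost any random pairing, whereas one whose surname starts with Z is first only when paired with another Z. An observed share well above the baseline for a given initial then suggests deliberate alphabetical ordering.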
Apparently, statistics exceeds this baseline
the most, indicating that many authors
intentionally ordered their surnames
alphabetically. By contrast, psychology
FIGURE 1: Proportions of articles with two authors having alphabetically ordered names separated by
the initial letter of the first author in economics (red), statistics (green), and psychology (blue). The bold
line depicts the probability of two random surnames being in alphabetical order (based on the observed
first‑letter distribution of our sample).2
FIGURE 2: Estimated number of citations against the number of authors (up to six) in economics, statistics
and psychology. Alphabetical ordering in blue and non‑alphabetical ordering in red.2
appears to be the discipline where the list of
authors is mostly ordered by the amount of
work contributed (i.e., the percentages are
close to the baseline probabilities). Listing
the authors non-alphabetically clearly
indicates that the first author contributed
the most, and thus is more prominently
associated with the work and potentially
recognised more often. Having a surname
beginning with an early alphabetical letter
could bring an advantage in academia, as
the probability of being the first author is
higher. We need to note here that articles
where the main author appears as the first
author are actually cited significantly less
than articles with purely alphabetical orders.
This order preference effect might result
from differences in academic tradition, and
could represent things other than quality
(dominance, conservatism, etc.). While
publications in statistics are less inflated
by non-cited articles when compared with
economics, these articles are less often cited
(compare Figure 2). According to the results
of the count model, the total numbers of
citations in economics were also higher than
for publications in psychology. Consequently,
economics attracted the most citations
compared to the two other disciplines, but
statistics is less inflated by zero citations. To
summarise, citation counts partly result from
differences in academic practices and specific
author characteristics can relate to these.
Does the number of authors
matter?
The number of authors can be considered
a structural influence on citation counts in
terms of information spreading. Naturally, it
takes time for an article to be cited and for
the academic community to acknowledge
new work. With more authors, the potential
for being cited, and the speed with which it
happens, increase: new information spreads
faster – not to mention the fact that direct
(or reciprocal) self-citations lead to higher
citation counts. Therefore, not surprisingly,
a positive effect on citation counts can be
observed for the number of authors, which
increases the likelihood of being cited as well
as the number of citations (with an expected
chance of one further citation being 1.011
times higher with each additional author).
For all disciplines, we observe a jump
in citations for papers by more than one
author – this can be seen by the additional
negative effect of an article having a single
author (included as indicator variable),
for which the incidence ratio is 0.75. This
relation between the average number of
citations and the number of authors is
illustrated in Figure 2. The relation between
the number of authors and citation counts
is straightforward and consistent over all
research disciplines investigated, with a
specific underrepresentation of single-
authored publications.
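As a rough illustration of how the two reported ratios combine, the sketch below treats 1.011 as a multiplicative rate ratio per additional author and 0.75 as the rate ratio attached to the single-author indicator. It assumes a log-link count model with all other covariates held fixed and ignores the zero-inflation part, so the numbers it produces are relative factors only, not predicted citation counts. The function name and the two-author reference are hypothetical choices.

```python
RATE_RATIO_PER_AUTHOR = 1.011    # reported factor per additional author
RATE_RATIO_SINGLE_AUTHOR = 0.75  # reported ratio for the single-author indicator


def relative_expected_citations(n_authors: int, reference: int = 2) -> float:
    """Expected citations with `n_authors` relative to `reference` authors.

    Assumption: effects are multiplicative in the count part of the model
    and everything else about the two articles is identical.
    """
    if n_authors < 1 or reference < 1:
        raise ValueError("author counts must be at least 1")

    def factor(n: int) -> float:
        single = RATE_RATIO_SINGLE_AUTHOR if n == 1 else 1.0
        return RATE_RATIO_PER_AUTHOR ** n * single

    return factor(n_authors) / factor(reference)


# Relative factors for one to six authors, with two authors as the reference.
for n in range(1, 7):
    print(n, round(relative_expected_citations(n), 3))
```

On this reading, most of the difference lies between single-authored and co-authored papers, while each further co-author adds only about one per cent, which matches the jump described above.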
Does name-sharing matter?
The sharing of surnames is of specific interest
to us, as it was to ‘A few Goodmen’ in 2015.6
The results obtained for the influence of
author name-sharing on citation counts
were surprising, as the relationship is in the
opposite direction to that expected: articles
having the same surnames (for some of the
authors) were actually cited less often, and in
cases where they were cited, their citations
were fewer in number (see Figure 3).
Since the number of papers with name-
sharing authors is significantly lower, the
estimated effects are less precise, which is
also reflected in the width of the prediction
intervals. Nevertheless, the estimated
effects are significant. Hence, recognition
simplicity does not improve citation counts.
A possible explanation here is that this
influence of recognising an article is largely
covered by the popularity of the first author’s
surname, and more frequent names more
often lead to co-authors sharing their
surnames. Having a common author name,
as a first author surname characteristic,
has consistent positive effects on both the
likelihood of being cited and its frequency.
It is important to mention that the surname
being a common name was based on the
exact spelling, rather than on pronunciation.
Both variables (i.e., correct spelling and
pronunciation of common surnames) have
significant effects, but the exact spelling was
associated with a better model fit. Thus, the
unique popular spelling of a name appears
crucial for recall simplicity.
That said, other demographic or personal
author characteristics might help to further
elaborate on this kind of relationship.
Reasons beyond surname popularity for
sharing the authorship might not improve the
quality of the paper or at least its perception
by the scientific community.
Does the title length matter?
Furthermore, other article-specific
characteristics influence citation counts
FIGURE 3: Authors sharing their name or not in economics, psychology and statistics, with the predicted
number of citations and their 95% prediction intervals obtained from 1,000 Monte Carlo simulations.2
which are not necessarily related to research
quality. For example, the longer the title of an
article the lower the citation counts (compare
Figure 4). This could result from information
processing, where, for example, the length
of the title of an article serves as an indicator
for recall simplicity. Bounded rationality,
in the form of limitations when recalling
more complex article titles, could lead to
lower citation counts, and simplicity might
help recognition here as a basis for citation
behaviour. Clearly, a direct dependence on
the length of the title cannot be generally
assumed for the quality of the overall article
and research. Content analysis is also clearly
a matter of interest here.
Citations: what really counts
We can draw conclusions here on the
application of citation indices as a universal
heuristic for informing decision-making
on various levels: because of its growing
importance for determining academic success,
more needs to be done to better understand
the various facets of citation behaviour. These
new results on arbitrary article characteristics
fuel the long-running debate about what the
number of citations can tell us and whether it
can be used as a reliable measure of research
quality. Let us add the following arguments to
the discussion.
■ When analysing articles’ citation counts, we
need to account for structural regularities,
such as the field of research, the number of
authors, and the duration since publication.
Citation frequencies can clearly be
expected to increase with the years since
publication and the number of authors.
Furthermore, authorship characteristics
remain influential for citation frequencies,
even when controlling for structural
regularities such as time and discipline.
■ With diverse drivers influencing citation
frequencies, citation counts need to
be treated more cautiously. If specific
authorship characteristics are influencing
the citation process, and various data
sources exist (e.g. Google Scholar,
PubMed, Web of Science) to evaluate the
dependencies here, then these can easily
be detected and controlled to better
inform decisions. Additional proxies
for research quality are thus needed to
supplement citation indices and journal
ranks. Alternative research quality
measures, beyond “counting citations”,
could also lead to innovative research
result presentations. This should include
the whole scientific process from stating
the idea up to the acknowledgement in
the research community. For instance,
supplementing traditional theoretical
scientific reasoning with interactive
elements (like graphical representations
allowing the user to change some input
values) could stimulate the scientific
discourse via peer understanding,
evaluation and discussion.
■ This is an appeal to increase transparency
and reproducibility to foster a stringent
scientific validation and reflection to
increase the general interpretability of
scientific results. In our understanding,
this runs counter to the practice of using
social media to increase short-term impacts
by boosting citation counts. Knowledge
transfers to the public are important
in science, but need to be based on
transparent and solid scientific grounds,
which include all steps of information
generation and spread. Barriers such as
article processing charges and open-access
fees further restrict the knowledge transfer
to society. Without solid publication
indices, easily accessible data, open
research documentation, open-access
online publications and, most importantly,
competent scholars achieving these
aspirations, we would all be increasingly
left in a cloud of fuzzy publication metrics.
References
1. Box, G. E. P. and Cox, D. R. (1964) An analysis of
transformations. Journal of the Royal Statistical Society,
Series B, 26(2), 211–252.
2. Otto, P. and Otto, P. (2022) Impact of academic
authorship characteristics on article citations. REVSTAT,
20(4), 427–447.
3. Zeileis, A., Kleiber, C. and Jackman, S. (2008)
Regression models for count data in R. Journal of
Statistical Software, 27(8).
4. Einav, L. and Yariv, L. (2006) What’s in a surname?
The effects of surname initials on academic success.
Journal of Economic Perspectives, 20(1), 175–187.
5. Van Praag, C. M. and van Praag, B. (2008) The benefits
of being economics professor A (rather than Z).
Economica, 75(300), 782–796.
6. Goodman, A. C., Goodman, J., Goodman, L. and
Goodman, S. (2015) A few Goodmen: Surname-sharing
economist coauthors. Economic Inquiry, 53(2), 1392–1395.
FIGURE 4: Predicted number of citations by title length (in words) for different numbers of authors for economics,
statistics and psychology and their 95% prediction intervals obtained from 1,000 Monte Carlo simulations of
the estimated model. Note that the intervals are much narrower for papers with more than one author.2