Abstract

Does an author’s name affect their chances of being cited? Here, Philipp Otto and Philipp Otto - yes, two researchers with the same name - investigate the impact of academic authorship characteristics on article citations
Being cited is usually good news,
even if the reference is critical. The
number of citations reflects the
scientific impact of an article and
oen provides an initial assessment of the
quality of the publication cited, its author,
or the publishing journal. Thus, it is not
surprising that journal rankings and, thus,
academic success are increasingly based
on citation counts. However, various article
characteristics that do not necessarily relate
to the quality of an article can influence
citation counts.
For example, you are currently reading
an article by two researchers with the
same name who, when we first met,
worked at the same university, albeit
in dierent departments (statistics and
microeconomics). A famous similar
example is the collaboration of George
Box and Sir David Cox,1 who are said to
have worked together on their “analysis of
transformations” because of the similarity
of their names. Does the unusual nature of
the authorship mean you are more likely to
remember this article? And, if you are, does
that also mean you are more prone to cite it
in your work?
Such arbitrary influences on citation counts
are important to identify, because they can
distort the relationship between scientific
impact measures and research quality.
Moreover, these arbitrary factors may depend
on the readers’ perspective – for example,
perception of names may dier according
to readers’ own cultural backgrounds. So a
reasonable question is, how much variability
in citation frequencies can be explained by
other easily identifiable article dierentials?
To critically examine this universal evaluation
practice, in which quality assessments are
largely based on citation counts only, in a
separate article we analyse potential citation
count dependencies for various examples of
article and authorship characteristics for the
top publications in economics, psychology
and statistics.2
Our empirical analysis was based on all
articles published in the 115 top journals
in economics, psychology and statistics as
determined by the SCImago journal ranking.
What’s in aname?
Does an author’s name aect their chances of being cited? Here, Philipp Otto and Philipp Otto – yes, two
researchers with the same name – investigate the impact of academic authorship characteristics on article citations
SIGNIFICANCE
34 February 2023 © 2023 The Royal Statistical Society
Downloaded from https://academic.oup.com/jrssig/article/20/1/34/7034190 by guest on 13 February 2023
Philipp Otto investigates new forms of work
at the Department of Work and Organizational
Psychology at the University of Bamberg and is
a lecturer at the Medical School Brandenburg as
well as the European University Viadrina.
Philipp Otto is junior professor
of big geospatial data at
the Institute of Cartography
and Geoinformatics, Leibniz
University Hannover, Germany.
For each publication published in these
journals between 1990 and 2016, various
article authorship characteristics and the
total number of citations until November
2017 were retrieved from Microsoft Academic
Search. All analyses reported here are based
on this data set, unless other sources are
explicitly indicated.
To account for the excess of zero citations,
a zero-inflated negative binomial model was
used as it provided a comparatively better fit
than comparable models (e.g., a zero-inflated
Poisson model). Thus, we could distinguish
between two dierent eects – determinants
causing an excess of zero citations and
influences associated with an increase/
decrease in the total number of citations.
Regression coeicients were estimated from
the model using the maximum likelihood
approach for a generalised linear model,
which is computationally implemented by
Zeileis et al.3 In general, overdispersion as
implied by the negative binomial model is a
commonly observed phenomenon of citation
counts or numbers of publications.
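
To make the two parts of such a model concrete, the following is a minimal sketch of a zero-inflated negative binomial probability function. It is not the authors' code (the article points to the R implementation by Zeileis et al.); the function names and the parameter values `mu`, `size` and `pi_zero` are hypothetical and chosen only for illustration.

```python
import math


def nb_pmf(k: int, mu: float, size: float) -> float:
    """Negative binomial probability of k, with mean mu and dispersion size."""
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k: int, mu: float, size: float, pi_zero: float) -> float:
    """Zero-inflated variant: with probability pi_zero the count is a structural zero."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + base if k == 0 else base


# Hypothetical parameter values, not estimates from the article.
mu, size, pi_zero = 20.0, 0.8, 0.15

p_zero_nb = nb_pmf(0, mu, size)                # zeros from the count part alone
p_zero_zinb = zinb_pmf(0, mu, size, pi_zero)   # zeros after adding the excess-zero part
mean_zinb = (1.0 - pi_zero) * mu               # expected citations under the mixture
```

The zero-inflation part only shifts probability mass towards zero, while the count part governs how many citations a cited article receives. The variance of the count part is `mu + mu**2 / size`, which exceeds the mean and so allows for the overdispersion mentioned above.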
Does alphabetical ordering
matter?
The order in which authors are listed in
joint publications is a prominent example
of authorship-related extrinsic article
characteristics aecting citation counts.
Some journals prefer purely alphabetical
order, while others list authors in descending
order of the amount of work contributed (i.e.,
the first author would be the main author of
the article).
Interestingly, a positive correlation
between surname initials and the scientific
success of the author has been reported by
Einav and Yariv4 for US researchers at top
institutions in economics, and by van Praag
and van Praag5 analysing the total number of
publications, although the initial letter of an
author’s surname in the alphabet is in no way
related to the quality of the research of that
person – at least as far as we know. However, it
can be assumed that even when authors are
listed alphabetically, the first author is most
prominently attached to the article.
Still, the probability of being the first
author by chance depends on the overall
frequency of names for each initial. Therefore
we assess the percentage of papers with
alphabetically sorted authors in Figure 1,
which shows the percentage of alphabetical
orders for all papers with two authors (only
considering the first letters of the surnames).
Research disciplines clearly vary with regard
to their author listing practices. Here, we
selected the journals based on the SCImago
ranking, which follows the Scopus definition
of the disciplines.
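
As an illustration of how a chance baseline of this kind could be computed from a first-letter distribution, here is a minimal sketch. The function name, the toy sample and the treatment of ties (two surnames with the same initial counted as alphabetical) are assumptions made for the example; the article does not state how ties were handled.

```python
from collections import Counter


def alphabetical_baseline(initials, ties_count=True):
    """For each first-author initial, return the chance that a randomly drawn
    second surname does not come earlier in the alphabet (first letters only)."""
    counts = Counter(letter.upper() for letter in initials)
    total = sum(counts.values())
    share = {letter: n / total for letter, n in counts.items()}

    baseline = {}
    for first in sorted(share):
        # Share of surnames whose initial comes strictly later in the alphabet.
        later = sum(p for letter, p in share.items() if letter > first)
        # Assumption: identical initials count as alphabetical unless ties_count is False.
        same = share[first] if ties_count else 0.0
        baseline[first] = later + same
    return baseline


# Hypothetical toy sample of surname initials, not the article's data.
sample = list("AABBBCSSSZ")
baseline = alphabetical_baseline(sample)
```

In this toy sample a first author with initial A is in alphabetical order by construction, whereas a first author with initial Z is in order only when the co-author's surname also starts with Z. Observed percentages well above such a baseline suggest deliberate alphabetical ordering.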

FIGURE 1: Proportions of articles with two authors having alphabetically ordered names separated by
the initial letter of the first author in economics (red), statistics (green), and psychology (blue). The bold
line depicts the probability of two random surnames being in alphabetical order (based on the observed
first-letter distribution of our sample).2
FIGURE 2: Estimated number of citations against the number of authors (up to six) in economics, statistics
and psychology. Alphabetical ordering in blue and non-alphabetical ordering in red.2

Apparently, statistics exceeds this baseline
the most, indicating that many authors
intentionally ordered their surnames
alphabetically. By contrast, psychology
appears to be the discipline where the list of
authors is mostly ordered by the amount of
work contributed (i.e., the percentages are
close to the baseline probabilities). Listing
the authors non-alphabetically clearly
indicates that the first author contributed
the most, and thus is more prominently
associated with the work and potentially
recognised more oen. Having a surname
beginning with an early alphabetical letter
could bring an advantage in academia, as
the probability of being the first author is
higher. We need to note here that articles
where the main author appears as the first
author are actually cited significantly less
than articles with purely alphabetical orders.
This order preference eect might result
from dierences in academic tradition, and
could represent things other than quality
(dominance, conservatism, etc.). While
publications in statistics are less inflated
by non-cited articles when compared with
economics, these articles are less oen cited
(compare Figure 2). According to the results
of the count model, the total numbers of
citations in economics were also higher than
for publications in psychology. Consequently,
economics attracted the most citations
compared to the two other disciplines, but
statistics is less inflated by zero citations. To
summarise, citation counts partly result from
dierences in academic practices and specific
author characteristics can relate to these.
Does the number of authors
matter?
The number of authors can be considered
a structural influence on citation counts in
terms of information spreading. Naturally, it
takes time for an article to be cited and for
the academic community to acknowledge
new work. With more authors, the potential
for being cited, and the speed with which it
happens, increase: new information spreads
faster – not to mention the fact that direct
(or reciprocal) self-citations lead to higher
citation counts. Therefore, not surprisingly,
a positive eect on citation counts can be
observed for the number of authors, which
increases the likelihood of being cited as well
as the number of citations (with an expected
chance of one further citation being 1.011
times higher with each additional author).
For all disciplines, we observe a jump
in citations for papers by more than one
author – this can be seen by the additional
negative eect of an article having a single
author (included as indicator variable),
for which the incidence ratio is 0.75. This
relation between the average number of
citations and the number of authors is
illustrated in Figure 2. The relation between
the number of authors and citation counts
is straightforward and consistent over all
research disciplines investigated, with a
specific underrepresentation of single-
authored publications.
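
The two reported ratios can be combined in a short worked example. This is a sketch under stated assumptions, not the authors' calculation: it treats 1.011 and 0.75 as multiplicative ratios on the expected citation count of the count part of the model, holds all other article characteristics fixed, and ignores the excess-zero part. The function names are hypothetical.

```python
PER_AUTHOR_RATIO = 1.011    # reported ratio per additional author
SINGLE_AUTHOR_RATIO = 0.75  # reported incidence ratio for the single-author indicator


def citation_multiplier(n_authors: int) -> float:
    """Multiplicative contribution of the authorship terms to the expected count."""
    if n_authors < 1:
        raise ValueError("an article needs at least one author")
    multiplier = PER_AUTHOR_RATIO ** n_authors
    if n_authors == 1:
        # Assumption: the single-author indicator multiplies the expected count.
        multiplier *= SINGLE_AUTHOR_RATIO
    return multiplier


def relative_citations(n_authors: int, reference_authors: int = 2) -> float:
    """Expected citations relative to an otherwise identical reference article."""
    return citation_multiplier(n_authors) / citation_multiplier(reference_authors)


three_vs_two = relative_citations(3)  # one additional author
single_vs_two = relative_citations(1)  # single author compared with two authors
```

Under these assumptions, most of the gap between single-authored and co-authored articles comes from the indicator term, while each further author adds only about 1% to the expected count.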
Does name-sharing matter?
The sharing of surnames is of specific interest
to us, as it was to ‘A few Goodmen’ in 2015.6
The results obtained for the influence of
author name-sharing on citation counts
were surprising, as the relationship is in the
opposite direction to that expected: articles
having the same surnames (for some of the
authors) were actually cited less oen, and in
cases where they were cited, their citations
were fewer in number (see Figure 3).
Since the number of papers with name-
sharing authors is significantly lower, the
estimated eects are less precise, which is
also reflected in the width of the prediction
intervals. Nevertheless, the estimated
eects are significant. Hence, recognition
simplicity does not improve citation counts.
A possible explanation here is that this
influence of recognising an article is largely
covered by the popularity of the first author’s
surname, and more frequent names more
oen lead to co-authors sharing their
surnames. Having a common author name,
as a first author surname characteristic,
has consistent positive eects on both the
likelihood of being cited and its frequency.
It is important to mention that the surname
being a common name was based on the
exact spelling, rather than on pronunciation.
Both variables (i.e., correct spelling and
pronunciation of common surnames) have
significant eects, but the exact spelling was
associated with a better model fit. Thus, the
unique popular spelling of a name appears
crucial for recall simplicity.
That said, other demographic or personal
author characteristics might help to further
elaborate on this kind of relationship.
Reasons beyond surname popularity for
sharing the authorship might not improve the
quality of the paper or at least its perception
by the scientific community.
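
Prediction intervals of the kind mentioned above can be approximated by simulation. The following minimal sketch draws 1,000 counts from a zero-inflated negative binomial distribution and reads off empirical percentiles. It is an illustration only: the parameter values are hypothetical, the function names are not from the article, and the sketch simulates outcome variability at fixed parameters, whereas the authors' simulations may also reflect uncertainty in the estimated coefficients.

```python
import random


def draw_zinb(rng, mu, size, pi_zero, max_count=100_000):
    """Draw one count from a zero-inflated negative binomial by inverting the CDF."""
    if rng.random() < pi_zero:
        return 0  # structural zero from the excess-zero part
    u = rng.random()
    k = 0
    prob = (size / (size + mu)) ** size  # probability of a zero in the count part
    cumulative = prob
    while cumulative < u and k < max_count:
        # Ratio of successive negative binomial probabilities.
        prob *= (k + size) / (k + 1) * mu / (size + mu)
        k += 1
        cumulative += prob
    return k


def prediction_interval(mu, size, pi_zero, n_sim=1000, level=0.95, seed=1):
    """Empirical prediction interval from n_sim simulated citation counts."""
    rng = random.Random(seed)
    draws = sorted(draw_zinb(rng, mu, size, pi_zero) for _ in range(n_sim))
    tail = (1.0 - level) / 2.0
    lower = draws[int(tail * n_sim)]
    upper = draws[min(n_sim - 1, int((1.0 - tail) * n_sim))]
    return lower, upper


# Hypothetical parameter values, not estimates from the article.
lower, upper = prediction_interval(mu=20.0, size=0.8, pi_zero=0.15)
```

With many zeros and a heavy right tail, such intervals tend to start at zero and stretch far above the expected count. In a fitted model, groups with few observations would additionally have less certain parameters, which is one reason their intervals widen.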
FIGURE 3: Authors sharing their name or not in economics, psychology and statistics, with the predicted
number of citations and their 95% prediction intervals obtained from 1,000 Monte Carlo simulations.2

Does the title length matter?
Furthermore, other article-specific
characteristics influence citation counts
which are not necessarily related to research
quality. For example, the longer the title of an
article the lower the citation counts (compare
Figure 4). This could result from information
processing, where, for example, the length
of the title of an article serves as an indicator
for recall simplicity. Bounded rationality,
in the form of limitations when recalling
more complex article titles, could lead to
lower citation counts, and simplicity might
help recognition here as a basis for citation
behaviour. Clearly, a direct dependence on
the length of the title cannot be generally
assumed for the quality of the overall article
and research. Content analysis is also clearly
a matter of interest here.
Citations: what really counts
We can draw conclusions here on the
application of citation indices as a universal
heuristic for informing decision-making
on various levels: because of its growing
importance for determining academic success,
more needs to be done to better understand
the various facets of citation behaviour. These
new results on arbitrary article characteristics
fuel the long-running debate about what the
number of citations can tell us and whether it
can be used as a reliable measure of research
quality. Let us add the following arguments to
the discussion.
When analysing articles’ citation counts, we
need to account for structural regularities,
such as the field of research, the number of
authors, and the duration since publication.
Citation frequencies can clearly be
expected to increase with the years since
publication and the number of authors.
Furthermore, authorship characteristics
remain influential for citation frequencies,
even when controlling for structural
regularities such as time and discipline.
With diverse drivers influencing citation
frequencies, citation counts need to
be treated more cautiously. If specific
authorship characteristics are influencing
the citation process, and various data
sources exist (e.g. Google Scholar,
PubMed, Web of Science) to evaluate the
dependencies here, then these can easily
be detected and controlled to better
inform decisions. Additional proxies
for research quality are thus needed to
supplement citation indices and journal
ranks. Alternative research quality
measures, beyond “counting citations”,
could also lead to innovative research
result presentations. This should include
the whole scientific process from stating
the idea up to the acknowledgement in
the research community. For instance,
supplementing traditional theoretical
scientific reasoning with interactive
elements (like graphical representations
allowing the user to change some input
values) could stimulate the scientific
discourse via peer understanding,
evaluation and discussion.
This is an appeal to increase transparency
and reproducibility to foster a stringent
scientific validation and reflection to
increase the general interpretability of
scientific results. In our understanding,
this runs counter to the practice of using
social media to increase short-term impacts
by boosting citation counts. Knowledge
transfers to the public are important
in science, but need to be based on
transparent and solid scientific grounds,
which include all steps of information
generation and spread. Barriers such as
article processing charges and open-access
fees further restrict the knowledge transfer
to society. Without solid publication
indices, easily accessible data, open
research documentation, open-access
online publications and, most importantly,
competent scholars achieving these
aspirations, we would all be increasingly
le in a cloud of fuzzy publication metrics.
References
1. Box, G. E. P. and Cox, D. R. (1964) An analysis of
transformations. Journal of the Royal Statistical Society,
Series B, 26(2), 211–252.
2. Otto, P. and Otto, P. (2022) Impact of academic
authorship characteristics on article citations. REVSTAT,
20(4), 427–447.
3. Zeileis, A., Kleiber, C. and Jackman, S. (2008)
Regression models for count data in R. Journal of
Statistical Soware, 27(8).
4. Einav, L. and Yariv, L. (2006) What’s in a surname?
The eects of surname initials on academic success.
Journal of Economic Perspectives, 20(1), 175–187.
5. Van Praag, C. M. and van Praag, B. (2008) The benets
of being economics professor A (rather than Z).
Economica, 75(300), 782–796.
6. Goodman, A. C., Goodman, J., Goodman, L. and
Goodman, S. (2015) A few Goodmen: Surname-sharing
economist coauthors. Economic Inquiry, 53(2), 1392–1395.
FIGURE 4: Predicted number of citations by title length (in words) for different numbers of authors for economics,
statistics and psychology and their 95% prediction intervals obtained from 1,000 Monte Carlo simulations of
the estimated model. Note that the intervals are much narrower for papers with more than one author.2