The counting house, measuring those who count: Presence of Bibliometrics, Scientometrics, Informetrics, Webometrics and Altmetrics in Google Scholar Citations, ResearcherID, ResearchGate, Mendeley, & Twitter

Alberto Martín-Martín (1), Enrique Orduna-Malea (2), Juan M. Ayllón (1) & Emilio Delgado López-Cózar (1)

(1) EC3 Research Group: Evaluación de la Ciencia y de la Comunicación Científica, Universidad de Granada (Spain)
(2) EC3 Research Group: Evaluación de la Ciencia y de la Comunicación Científica, Universidad Politécnica de Valencia (Spain)
ABSTRACT
Following in the footsteps of the model of scientific communication, which has recently gone through a metamorphosis (from the Gutenberg galaxy to the Web galaxy), a change in the model and methods of scientific evaluation is also taking place. A set of new scientific tools now provide a variety of indicators which measure all actions and interactions among scientists in the digital space, making new aspects of scientific communication emerge. In this work we present a method for "capturing" the structure of an entire scientific community (the Bibliometrics, Scientometrics, Informetrics, Webometrics, and Altmetrics community) and the main agents that are part of it (scientists, documents, and sources) through the lens of Google Scholar Citations (GSC).
Additionally, we compare these author "portraits" to the ones offered by other profile or social platforms currently used by academics (ResearcherID, ResearchGate, Mendeley, and Twitter), in order to test their degree of use, completeness, reliability, and the validity of the information they provide. A sample of 814 authors (researchers in Bibliometrics with a public profile created in GSC) was subsequently searched in the other platforms, collecting the main indicators computed by each of them. The data collection was carried out in September 2015. The Spearman correlation (α = 0.05) was applied to these indicators (a total of 31), and a Principal Component Analysis was carried out in order to reveal the relationships among metrics and platforms as well as the possible existence of metric clusters.
We found that it is feasible to depict an accurate representation of the current state of the Bibliometrics community using data from GSC (the most influential authors, documents, journals, and publishers). Regarding the number of authors found in each platform, GSC takes the first place (814 authors), followed at a distance by ResearchGate (543), which is currently growing at a vertiginous speed. The number of Mendeley profiles is high, although 17.1% of them are basically empty. ResearcherID is also affected by this issue (34.45% of the profiles are empty), as is Twitter (47% of the Twitter accounts have published fewer than 100 tweets). Only 11% of our sample (93 authors) have created a profile in all the platforms analyzed in this study. From the PCA, we found two kinds of impact on the Web: first, all metrics related to academic impact, a group that can further be divided into usage metrics (views and downloads) and citation metrics; second, all metrics related to connectivity and popularity (followers). ResearchGate indicators, as well as Mendeley readers, present a high correlation to all the indicators from GSC, but only a moderate correlation to the indicators in ResearcherID. Twitter indicators achieve only low correlations to the rest of the indicators, the highest of these being to GSC (0.42-0.46) and to Mendeley (0.41-0.46).
Lastly, we present a taxonomy of all the errors that may affect the reliability of the data contained in each of these platforms, with a special emphasis on GSC, since it has been our main source of data. These errors alert us to the danger of blindly using any of these platforms for the assessment of individuals without verifying the veracity and exhaustiveness of the data. In addition to this working paper, we have also made available a website, Scholar Mirrors, where all the data obtained for each author and the results of the analysis of the most cited documents can be found.
KEYWORDS
Google Scholar; Social media metrics; Bibliometrics; Altmetrics; Mendeley; ResearchGate; ResearcherID; Twitter; Academic profiles.
60 pages, 12 tables, 35 figures
EC3’s Document Series:
EC3 Working Papers Nº 21
Document History
Version 1.0. 19th of January, 2016. Granada
Cite as
Martín-Martín, A.; Orduna-Malea, E.; Ayllón, J.M. & Delgado López-Cózar, E. (2016). "The counting house: measuring those who count. Presence of Bibliometrics, Scientometrics, Informetrics, Webometrics and Altmetrics in Google Scholar Citations, ResearcherID, ResearchGate, Mendeley & Twitter". EC3 Working Papers, 21. 19th of January, 2016. 60 pages, 12 tables, 35 figures.
Corresponding author
Emilio Delgado López-Cózar. edelgado@ugr.es
1. INTRODUCTION
1.1. Disciplines and scientific communities: territories and the tribes of
Science
Science, in order to be properly investigated, grasped, and taught, has usually been organized into various areas of knowledge. Over time, each of these areas
has been further divided into fields, subfields, disciplines, and specialties, as a
result of the ever faster growth of knowledge and the parallel increase in the
number of people who form the scientific communities within each of these
areas. This process of scientific budding follows the life cycle of a living being
(birth, growth, reproduction, and death), and is subject to endless
metamorphosis, each discipline displaying its own idiosyncrasies.
Each of the units in which scientific knowledge is structured has its own epistemological properties (its object, its principles, and its methods) that endow it with a characteristic identity, as well as boundaries that demarcate its cognitive territory. These inner and outer boundaries are not always clearly defined: there are overlaps between disciplines, gaps, and loops, sometimes quite vague and difficult to trace.
The different areas of knowledge are populated by communities of scientists
and professionals, each group using their own tools, methodologies and
techniques. These are social groups that share - with more or less consensus -
professional practices, forms of work organization, living conditions, social
expectations, principles, values, and beliefs.
Whitley (1984) dissected with a precision close to that of a surgeon's scalpel the
process by which the academic communities - and their disciplines and
specialties - become socially and cognitively institutionalized: how they create
organizations that allow them to associate in order to defend their interests, how
they erect spaces for the exchange of ideas and social development
(conferences, seminars, forums, etc.), how they institute professional
(newsletter, discussion list) or scientific (journals) means of communication,
how they obtain academic standing by teaching the subject at the university
(courses in graduate and postgraduate programs, including Master and PhD
degrees), how they create groups, departments, laboratories, and companies
dedicated to advancing research, how they define research agendas that set out not only research problems but also the ways to address and solve them, and how they create a common language in which to establish ideas and
principles. Not to mention that the process of social and cognitive
institutionalization of disciplines is directly influenced by the geographic location
and the different levels of economic and cultural development of the countries
where they are based.
As masterfully formulated by Becher and Trowler (2001), there is a close
relationship between the disciplines (territories of knowledge) and people who
advance them (scientific tribes); between the epistemic properties of the forms
of scientific knowledge and the social aspects of academic communities. This is
why any analysis of a discipline cannot ignore these two areas: the cognitive
(disciplines) and social (community); you cannot understand one without the
other.
Therefore, the ultimate aim of this Working Paper is to portray a discipline
(Bibliometrics) and those who practice it, because a discipline is what is
performed by those who cultivate it. Consequently, identifying the members of
the Bibliometric tribe is one of the goals of this work.
1.2. A discipline with many names
There are numerous works which address the history of our field of knowledge
(Broadus, 1987a; Hertzel, 1987; Shapiro, 1992; Godin, 2006; De Bellis, 2009). Its denomination, object of study, and scope have been addressed as well (Lawani, 1981; Bonitz, 1982; Peritz, 1984; Broadus, 1987b; Brookes, 1988; 1990; Sengupta, 1992; Glänzel & Schoepflin, 1994; Braun, 1994; Gorbea, 1995; Hood & Wilson, 2001; Cronin, 2001; Thelwall, 2008; Larivière, 2012). There are also several literature reviews about this subject (Narin & Moll, 1977; White & McCain, 1989; Van Raan, 1997; Wilson, 1999; Borgman & Furner, 2002).
Bibliometrics can be synthetically defined as the discipline responsible for measuring communication and, in a broader sense, as the specialty responsible for quantitatively studying the production, distribution, dissemination, and consumption of the information conveyed in any type of document (book, journal, conference, patent, or website) and in any intellectual field, with special attention to scientific information. It is a discipline with peculiar features:
- It is a very young discipline: although rooted in the early twentieth century in the library environment, with the idea of measuring the production of knowledge (bibliographic statistics) and properly managing library collections, it was not until after World War II that Bibliometrics really started to set its foundations. Its epistemic fundamentals are still boiling (they are not fully settled yet).
- It is a discipline better defined by its methods than by the thematic areas covered (the so-called "metrics": quantitative data analysis applying various statistical techniques).
- It has a strong interdisciplinary character which arises from the
incorporation of methods and techniques developed in other fields, and by
its application to the study of any subject area. This makes Bibliometrics
an open discipline, willing to be fertilized by ideas of the most diverse origins and to accept scientists from the most diverse disciplinary
environments. This is the reason why Bibliometrics resembles a
crossroads, a place where different scientific traditions meet.
The young age of the discipline and its interdisciplinary and instrumental character are the reasons why this discipline is known by many different names.
However, this fact does not mean the subject of study or the borders of the
discipline are not clearly defined. Rather, it is a sign of the coexistence of
different traditions that have shaped the development of the discipline.
Bibliometrics is the original and most widespread name. It stems from the
bibliographic tradition represented by Paul Otlet with his proposal for a
"bibliometrie", a Science for measuring all the dimensions of books and other
documents, and from the library tradition, concerned since ancient times with
measuring the growth of knowledge and usage of its holdings.
Scientometrics is oriented towards the quantitative analysis of scientific and
technical literature. It comes from the tradition of the science of science (space
of confluence of Sociology, History, and Philosophy of science), to which
science policy is also linked. The creation of the citation indexes (databases dedicated to the collection of scientific production) was crucial for this scientometric orientation.
Informetrics is focused on the discovery of mathematical models that explain
the properties of information. It is connected with modern information science. The designation is so close to Scientometrics that it is sometimes difficult to tell them apart.
Webometrics and Altmetrics are the most recent denominations. They started to
gain momentum as the use of the new information and communication
technologies began to spread. They have developed in the tradition of modern Library and Information Science, a discipline increasingly oriented towards computer science and computing itself. These new names are strongly influenced by the medium in which information is conveyed rather than by the content itself. They also highlight the technological character that the different metric specialties have displayed since their inception.
An analysis of the terms used in the titles of documents in our field published
between 1969 and 2015 and indexed in Google Scholar (Figure 1) shows a
clear predominance of the term Bibliometrics, followed by Scientometrics.
However, in the last three years the term Altmetrics has been used increasingly, as a result of the novelty of the new social media communication technologies.
Figure 1. Number of results returned by Google Scholar for the terms Bibliometrics,
Scientometrics, Informetrics, Webometrics and Altmetrics contained only within the document
titles by year (1969-2015)
A similar result is obtained when the keywords used by the 814 scientists
specialized in Bibliometrics or working sporadically in this field with a public
profile on Google Scholar Citations are analyzed (Figure 2). The prevalence of
Scientometrics and Bibliometrics is clear, although the weight of the latter would
be higher had the terms been properly standardized.
Figure 2. Word cloud of the keywords used by the researchers with a public Google Scholar
Citations profile analyzed in this product (size indicates frequency of use in the sample)
Furthermore, it is of great interest to know which other terms are used by
bibliometricians. The terms associated with Library and Information Science are very numerous, which shows that this discipline was the area from which Bibliometrics stemmed. Similarly, the relationship with science and
technology studies (and specifically with science policy) is obvious. Lastly, there
are also many terms related to research evaluation and citation analysis.
1.3. New mirrors and meters of Science: new media and new metrics
There is no better way to learn about a discipline than analysing its scientific
literature. The best mirror of a scientific discipline is precisely the intellectual
production that its academic community generates. This is the assumption on which Bibliometrics is based when it is used to examine the traits that define
other disciplines and specialties.
Knowing the scope of a discipline will not only help characterize and determine
its perspective and scientific nature, but it will also indirectly delineate its
internal structure, its coherence, its contours, and its location in the overall
picture of Sciences. This will enable an understanding of what the research is
and has been about in a particular discipline, and how it may evolve in the
future.
Today the number of venues in which the research results produced in any discipline are published has increased remarkably. The "Gutenberg paradigm", which limited research products to the printed world (and more specifically to the journal, the main communication channel), has been challenged since the end of the twentieth century by a plethora of new channels of communication that are created, indexed, searched, located, read, and mentioned in the shared hyperspace (Castells, 2002).
possible by the development and worldwide use of the Internet, and the social
web in particular. These are the new mirrors where the disciplines and
communities are reflected. Revealing and evaluating the role of these new
channels in Bibliometrics is another goal of this paper.
Following in the footsteps of the model of scientific communication, which has
recently gone through a metamorphosis (from the Gutenberg galaxy to the web
galaxy), a change in the model and methods of scientific evaluation is also
taking place. The new media, due to their electronic nature, are supplied with multiple indicators measuring all actions and interactions among scientists in the digital space. In this work we open the door to the new platforms that provide metric indicators (whose nature is still little understood because of their youth) and snoop inside to see what they tell us about the various facets of scientific communication, complementing in this way some recent works on the topic (Jamali, Nicholas & Herman, 2015; Mikki et al., 2015), where not only the potential of these new mirrors but also their limitations and perceptions are considered.
We intend to bring attention to some of these new metrics and look into their
meaning. In this way we position ourselves in the debate about "Altmetrics", but
using a different perspective: the perspective of individuals and not just the
documents they produce. We observe what these new metrics measure by
taking as the object of study precisely those researchers who measure others
(bibliometricians). In short, Bibliometrics, and those who measure, are
measured.
Following our line of research oriented towards discovering the inner depths of Google Scholar while testing its suitability as a tool for research evaluation, this time we have turned our efforts to investigating new uses for Google Scholar Citations (sometimes also known as Google Scholar Profiles). We present in this new
Working Paper a method to learn about the impact of an entire scientific
specialty: a very specific scientific and professional community (the
Bibliometrics, Scientometrics, Informetrics, Webometrics, and Altmetrics
community), and the main agents that are part of it (scientists, professionals,
the documents they produce, and the journals and publishers that publish these
documents).
From the scientific output of the members of the metrics and quantitative
information science studies community who have made public their profile on
Google Scholar Citations (GSC), we can develop a picture of this discipline.
Once we've seen the picture of the discipline that can be observed through the data available in GSC, we also want to compare it to its counterparts in other
academic web services, like ResearcherID, a researcher identification system
launched by Thomson Reuters, mainly built upon data from Web of
Science (which has been and still is the go-to source for many researchers in
the field of research evaluation), and other profiling services which arose in the
wake of the Web 2.0 movement: ResearchGate, an academic social network,
and Mendeley, a social reference manager which also offers profiling features.
These are the most widely known tools worldwide for academic profiling [1, 2].
These tools offer researchers the chance to create an academic profile, as well
as the chance to upload their publications, which are therefore available for
other researchers to access, download, and comment upon. Researchers can
also feed these databases with other kinds of data (tagging and following profiles, asking and answering specific questions) which might be useful to the rest of the platform's users.
In addition, we also include the links to the authors' homepages (the first tool researchers used to showcase their scientific activities on the Web), and Twitter, the popular microblogging site, in order to learn how much presence bibliometricians have on this platform and the kind of communication activities in which they take part there.
In short, our aim is to present a multifaceted and integral perspective of the
discipline, as well as to provide the opportunity for an easy and intuitive
comparison of these products and the reflections of scientific activity each of
them portrays.
This project can also be considered as an attempt to deconstruct traditional
journal, author, and institutional (mainly university) rankings, which are usually
built upon data from traditional citation databases (Web of Science, Scopus)
and are based exclusively on journal impact indicators. In this product, we use a bottom-up approach, analyzing the documents that are published by a group of authors associated with the discipline, that are published in the main journals of the discipline, or that use the most common and significant keywords in the discipline.
This is done in keeping with the widespread notion that the impact of the
various scientific units (documents, individuals, organizations, subject domains)
should be evaluated directly, using appropriate indicators for each unit, and not
by using proxies like, for example, the average impact of the journals where a researcher's or an institution's documents are published to evaluate that researcher or institution.
[1] http://www.nature.com/news/online-collaboration-scientists-and-the-social-network-1.15711
[2] https://101innovations.wordpress.com/2015/06/23/first-1000-responses-most-popular-tools-per-research-activity
In short, the objectives of this study are essentially the following:
1. Applying Google Scholar Citations to radiograph Bibliometrics as a discipline, identifying the most influential authors, documents, journals, and publishers in the field.
2. Comparing the user metric portraits generated by Google Scholar
Citations to those offered by new platforms for the management of
personal bibliographic profiles (ResearcherID, ResearchGate, and
Mendeley) and content dissemination and communication (Twitter).
3. Testing the completeness, reliability and validity of the information
provided by Google Scholar Citations (to generate disciplinary rankings),
and by the remaining social platforms (to generate complementary
academic mirrors of the scientific community).
2. METHODS
2.1. Search and identification of relevant authors
The first step was to identify all authors who have published in the areas of
Bibliometrics, Scientometrics, Informetrics, Webometrics or Altmetrics, and for
whom a Google Scholar Citations (GSC) public profile could be found at the
time the data was collected (24/07/2015).
In order to locate the set of authors relevant to our study (i.e., those who have
published in Bibliometrics and have a public profile in GSC), the following
search strategies were used:
a) Keywords
A search was conducted in core selected journals: Scientometrics, Journal of
Informetrics, Research Evaluation, Cybermetrics, and the ISSI conferences
(International Conference on Scientometrics and Informetrics) with the goal
of extracting the most frequently used and representative words in the
discipline. The selected keywords were:
o Altmetrics
o Bibliometrics
o Citation Analysis
o Citation Count
o H-Index
o Impact Factor
o Informetrics
o Patent Citation
o Quantitative Studies of Science and Technology
o Research Assessment
o Research Evaluation
o Research Policy
o Science and Technology Policy
o Science Evaluation
o Science Policy
o Science Studies
o Scientometrics
o Webometrics
All public GSC profiles containing any of these keywords as one of the
research interests were selected (GSC allows authors to display up to five
research interests).
The lack of normalization in the use of keywords sometimes forced us to search for variants of these keywords. These variants included misspelled words, the same keywords in other languages, etc. As an example, these are all the variants we found of the keyword "bibliometrics": bibliometric;
bibliometría; bibliometria; bibliometric analysis; bibliometric methods;
bibliometics; bibliometircs; bibliometric analysis in mining sciences;
bibliometric mapping; bibliometric studies; bibliometric visualization;
bibliometric.; bibliometrics methodology; bibliometrics of social sciences
and…; bibliometrics.; bibliometrics...; bibliométrie; bibliometry.
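As an illustration of the kind of normalization this required, the sketch below (ours, not the authors' actual code) maps raw GSC research interests to the canonical keyword using part of the variant list above:

```python
from typing import Optional

# Illustrative sketch: map raw GSC "research interests" to the canonical
# keyword "bibliometrics". The variant set reproduces part of the list above.
BIBLIOMETRICS_VARIANTS = {
    "bibliometric", "bibliometria", "bibliometría", "bibliométrie",
    "bibliometry", "bibliometics", "bibliometircs",
    "bibliometric analysis", "bibliometric methods", "bibliometric mapping",
    "bibliometric studies", "bibliometric visualization",
}

def normalize_keyword(raw: str) -> Optional[str]:
    """Return 'bibliometrics' if the raw interest is a known variant."""
    key = raw.strip().lower().rstrip(".")  # drop trailing periods/ellipses
    if key == "bibliometrics" or key in BIBLIOMETRICS_VARIANTS:
        return "bibliometrics"
    return None

assert normalize_keyword("Bibliometría") == "bibliometrics"
assert normalize_keyword("webometrics") is None
```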
b) Institutional affiliation
All the profiles associated with research centers working on Bibliometrics
were also selected. As an example, the profiles with these verified e-mail
domains were selected: cwts.leidenuniv.nl, cwts.nl, science-metrix.com, etc.
c) Additional searches
Since there may be some authors working in the discipline who have created
a public GSC profile, but who haven't added significant keywords or appropriately filled in the institution field of their profile, we also conducted a
topic search on Google Scholar (using the same keywords as before) as well
as a journal search (all the documents indexed in Google Scholar published
in the following journals: Scientometrics, Journal of Informetrics, Research
Evaluation, Cybermetrics, and ISSI proceedings), with the aim of finding
authors we might have missed with the previous two strategies. These
searches returned roughly 15,000 documents. Additionally, these searches
allowed us to find documents written by authors with no public GSC profile,
but which are nonetheless extremely relevant to the discipline.
All these searches were conducted on the 24th of July, 2015.
2.2. Filtering and classification of author profiles
Since Google Scholar Citations gives authors complete control over how they set up their profiles (personal information, institutional affiliation, research interests, as well as their scientific production), a systematic manual revision was carried out in order to:
- Detect false positives: authors whose scientific production doesn't have anything to do with this discipline, even though they labeled themselves with one or more of the keywords associated with it.
- Classify authors into two categories:
  o Core: authors whose scientific production substantially falls within the field of Bibliometrics.
  o Related: authors who have sporadically published bibliometric studies, or whose field of expertise is closely related to Scientometrics (social, political, and economic studies about science), and who therefore can't strictly be considered bibliometricians.
In order to set the limit between the two categories (core and related authors), we decided to consider as "core authors" those who meet the following criterion: at least half of the documents which contribute to their h-index must fall within the limits of the field of Bibliometrics.
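A minimal sketch of how this criterion could be applied, assuming each document carries a citation count and a boolean field label derived from the manual review of titles and venues described below (our illustration, not the authors' actual procedure):

```python
# Core/related criterion: at least half of the h-core documents must fall
# within Bibliometrics. Each document is a (citations, is_bibliometrics) pair.

def h_index(citations):
    """Largest h such that h documents have at least h citations each."""
    ordered = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ordered, start=1) if c >= rank)

def classify_author(docs):
    """docs: list of (citations, is_bibliometrics) tuples for one author."""
    docs = sorted(docs, key=lambda d: d[0], reverse=True)
    h = h_index([c for c, _ in docs])
    h_core = docs[:h]                  # documents contributing to the h-index
    in_field = sum(1 for _, is_bib in h_core if is_bib)
    return "core" if h and in_field >= h / 2 else "related"

# Example: h-index is 3; two of the three h-core papers are in the field.
print(classify_author([(40, True), (10, False), (5, True), (1, False)]))  # core
```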
We considered the titles of the documents, as well as the publishing channel where they appeared, focusing our attention on the journals. Our Bradford-like core of journals about Bibliometrics consisted of six journals (Scientometrics, Journal of Informetrics, JASIST, Research Evaluation, Research Policy, and Cybermetrics), followed by other LIS journals which also publish numerous bibliometric studies (Journal of Information Science, Information Processing & Management, Journal of Documentation, College & Research Libraries, Library Trends, Online Information Review, Revista Española de Documentación Científica, Aslib Proceedings, and El Profesional de la Información) and lastly,
journals devoted to social and political studies about science (Social Studies of
Science, Science and Public Policy, Minerva, Journal of Health Services
Research Policy, Technological Forecasting and Social Change, Science
Technology Human Values, Environmental Science Policy, and Current
Science).
In the end, we selected a total of 814 GSC profiles. 398 of them have been
classified as core authors, and the remaining 416 as related authors.
2.3. Expansion to a multi-faceted approach: units of scientific analysis
Once we defined the set of authors, we automatically extracted the top 100
most cited documents for each author from their GSC profile. To this set of
documents, we added the documents we found on our previous topic and
journal searches (the third strategy we used to find authors who work on
Bibliometrics).
After deleting duplicates, a set of roughly 41,000 documents remained. In the cases where several versions of the same document were found with different numbers of citations, the one with the highest citation count was selected. This list was then sorted by number of citations.
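The paper does not describe the exact matching procedure; the following sketch illustrates the step under the assumption that versions can be matched by a normalized title:

```python
# Illustrative deduplication: among versions sharing a title, keep the one
# with the highest citation count, then sort descending by citations.

def deduplicate(records):
    """records: iterable of dicts with at least 'title' and 'citations'."""
    best = {}
    for rec in records:
        key = " ".join(rec["title"].lower().split())  # naive title normalization
        if key not in best or rec["citations"] > best[key]["citations"]:
            best[key] = rec
    return sorted(best.values(), key=lambda r: r["citations"], reverse=True)

docs = [
    {"title": "Little Science, Big Science", "citations": 5410},
    {"title": "little science,  big science", "citations": 120},  # duplicate version
]
print(deduplicate(docs))  # keeps only the 5,410-citation version
```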
For each of the top 1,000 most cited documents in this list, both the basic
bibliographic information (especially the sources: journals and book publishers)
and the number of citations according to WoS (Web of Science) were collected.
For those documents that were not indexed in the WoS Core Collection (mostly books), the number of citations in WoS was calculated by searching for the document in WoS's Cited Reference Search. By doing this we are trying to highlight the (until now mostly neglected) potential of this tool, which truly offers a wealth of citation data that could be used for the evaluation of non-WoS documents.
Lastly, in the cases when a book is a collective work, the number of citations is
the sum of the citations to each of the chapters, in addition to the citations
directed to the book as a whole.
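Expressed as a one-line computation (our illustration):

```python
def book_citations(whole_book: int, chapter_citations: list) -> int:
    """Citations to a collective work: citations to the book as a whole
    plus the sum of the citations to each of its chapters."""
    return whole_book + sum(chapter_citations)
```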
2.4. Expansion to a multi-faceted approach: social media mirrors
The original 814 authors selected in the previous step (with a public profile
created in Google Scholar Citations) were subsequently searched by name in
ResearcherID, ResearchGate, Mendeley, and Twitter. In the cases where a
profile was found in any of these platforms, the main indicators provided by the
platform were collected. The data collection from these new academic mirrors
was carried out between the 4th and 10th of September, 2015.
Since the maturity of each platform is important in order to adequately assess its degree of use, the official release dates of the platforms are listed below:
- Google Scholar Citations: a restricted beta release was made on the 20th
of July, 2011. It was opened to the general public on the 16th of
November, 2011.
- ResearcherID: author identification system developed by Thomson
Reuters. Released in January 2008.
- ResearchGate: academic social network created in May 2008.
- Mendeley: social reference manager created in August 2008.
- Twitter: online social networking service that enables users to send and
read short 140-character messages. Released on the 15th of July, 2006.
The URLs of personal homepages were searched and collected as well. This information was retrieved from the "homepage" field included in the Google Scholar Citations profiles of the authors considered. Since there is no restriction on the kind of URL an author may use in this field, some authors choose to provide the URL of their profile in another platform (such as ResearchGate), or the URL of the research group, institution, or company they work for, among other options. This information was therefore filtered, and only personal or institutional websites managed directly by the authors are analyzed.
2.5. Author-level metrics: list and scope
All the metrics collected from each of the social media platforms analyzed, as
well as their definition and scope can be found in Table 1.
Table 1. List and explanation of author-level indicators
Google Scholar Citations
- Citations: number of citations to all publications. Computed for citations from all years, and for citations since 2010.
- h-index: the largest number h such that h publications have at least h citations. Computed for citations from all years, and for citations since 2010.
- i10-index: number of publications with at least 10 citations. Computed for citations from all years, and for citations since 2010.

ResearcherID
- Total Articles in Publication List: the number of items in the publication list.
- Articles With Citation Data: only articles added from the Web of Science Core Collection can be used to generate citation metrics; the publication list may contain articles from other sources. This value indicates how many articles from the publication list were used to generate the metrics.
- Sum of the Times Cited: the total number of citations from the Web of Science Core Collection to the items in the publication list. The number of citing articles may be smaller than the sum of the times cited, because an article may cite more than one item in the set of search results.
- Average Citations per Item: the average number of citing articles for all items in the publication list from the Web of Science Core Collection. It is the sum of the times cited divided by the number of articles used to generate the metrics.
- h-index: the largest number h such that h articles have at least h citations each. For example, an h-index of 20 means that there are 20 items that have 20 citations or more.

ResearchGate
- RG Score: a metric that measures scientific reputation based on how an author's research is received by his/her peers. The exact method used to calculate this metric has not been made public, but it takes into account how many times the contributions (papers, data, etc.) an author uploads to ResearchGate are visited and downloaded, and also by whom (reputation).
- Publications: total number of publications an author has added to his/her ResearchGate profile (with or without full text).
- Views: total number of times an author's contributions to ResearchGate have been viewed. This indicator has recently been combined with the "Downloads" indicator to form the new "Reads" indicator, but the data collection for this product was made before this change came into effect.
- Downloads: total number of times an author's contributions to ResearchGate have been downloaded. This indicator has recently been combined with the "Views" indicator to form the new "Reads" indicator, but the data collection for this product was made before this change came into effect.
- Citations: total number of citations to the documents uploaded to the profile. ResearchGate generates its own citation database, and warns that this number might not be exhaustive.
- Impact Points: sum of the JCR impact factors of the journals where the author has published articles.
- Profile Views: number of times the author's profile has been visited.
- Following: number of ResearchGate users the author follows (the author will receive notifications when those users upload new material to ResearchGate).
- Followers: number of ResearchGate users who follow the author (these users will receive notifications when the author uploads new material to ResearchGate).

Mendeley
- Readers: the total number of times a Mendeley user has added a document by this author to his/her personal library.
- Publications: number of publications the author has uploaded to Mendeley and classified as "My Publications".
- Followers: number of Mendeley users who follow the author.
- Following: number of Mendeley users the author follows.

Twitter
- Tweets: total number of tweets an author has published according to his/her profile.
- Followers: number of Twitter users who follow the tweets published by the author.
- Following: number of Twitter users the author follows.
- Days registered: number of days since the author created an account on Twitter.
- Sum Retweets: total number of retweets obtained by the author's tweets.
- H Retweets: an author has an h-Retweets index of n when n of his/her tweets have achieved at least n retweets each.
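Three of the indicators in Table 1 (the GSC h-index, the ResearcherID h-index, and the Twitter h-Retweets) share the same counting logic, which the following illustrative sketch makes explicit alongside the i10-index:

```python
# Illustration of the h-type indicators defined in Table 1 (h-index,
# h-Retweets) and of the i10-index.

def h_type_index(counts):
    """Largest h such that h items each have at least h counts."""
    ordered = sorted(counts, reverse=True)
    return sum(1 for rank, c in enumerate(ordered, start=1) if c >= rank)

def i10_index(citations):
    """Number of publications with at least 10 citations."""
    return sum(1 for c in citations if c >= 10)

citations = [48, 22, 15, 11, 9, 4, 1]
print(h_type_index(citations))  # 5  -> h-index
print(i10_index(citations))     # 4  -> i10-index
retweets = [12, 7, 3, 3, 1]
print(h_type_index(retweets))   # 3  -> h-Retweets
```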
2.6. Limitations
Projects of a bibliographic nature like this one can't ever reach perfection, and it is entirely possible that we may have missed relevant authors. The criteria for selecting the authors were two: first, the existence of a public GSC profile for the author as of 24/07/2015 (when the data collection was made), and second, that the author works in the fields of Bibliometrics, Scientometrics, Informetrics, Webometrics, or Altmetrics.
We're completely aware that these lists don't include all the researchers in the area, since some haven't created a profile, or haven't made it public. We should note that we made an exception with Eugene Garfield, one of the fathers of Bibliometrics. Despite the fact that he doesn't have a public GSC profile, we manually searched his production on Google Scholar and computed the same indicators GSC displays. We believe this Working Paper would be incomplete without him.
We strongly encourage researchers without a GSC profile, and especially those
who have made important contributions to the development of this field, to bring
together the scattered bibliographic information Google Scholar has already
compiled about their works. Sharing this information would not only greatly
benefit their online visibility; it would also be very useful to the rest of the
scientific community.
2.7. Statistical analysis
The Spearman correlation (α = 0.05) was applied to all 31 metrics considered in each of the platforms (excluding personal webpages), and finally a Principal Component Analysis (Spearman similarity with varimax rotation of the axes and uniform weighting) was carried out in order to reveal the relationships among metrics and platforms, as well as the possible existence of metric clusters.
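As an illustration, the following sketch reproduces the core of this pipeline with SciPy and NumPy; it is an assumption-laden simplification (varimax rotation is omitted, and `metrics` is assumed to be a complete authors × indicators matrix with no missing values):

```python
# Sketch of the statistical analysis: Spearman correlations among the 31
# indicators, followed by a PCA on the resulting correlation matrix.
# Varimax rotation (used in the paper) is omitted here for brevity.
import numpy as np
from scipy.stats import spearmanr

def spearman_pca(metrics, n_components=2):
    """metrics: (n_authors, n_indicators) array with no missing values."""
    rho, _ = spearmanr(metrics)               # indicator x indicator matrix
    eigval, eigvec = np.linalg.eigh(rho)      # PCA via eigendecomposition
    order = np.argsort(eigval)[::-1][:n_components]
    loadings = eigvec[:, order] * np.sqrt(eigval[order])
    explained = eigval[order] / eigval.sum()  # variance explained per axis
    return rho, loadings, explained
```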
3. RESULTS
3.1. The actors of Bibliometrics as a discipline, according to Google
Scholar Citations: authors, documents, journals and publishers
a) Authors
By analyzing the list of most influential authors of the discipline (Table 2) we
noticed that the most prominent positions (top ten) include the founders of the
discipline (Price and Garfield) and the most influential bibliometricians, almost
all of them holders of the Price medal (all except Chen), a prize that recognizes
scientists who have contributed with their work to the development of
Bibliometrics.
Table 2. Top 25 influential core authors in Bibliometrics according to Google Scholar Citations
AUTHOR | GS CITATIONS | H-INDEX
Loet Leydesdorff | 26,484 | 73
Eugene Garfield | 22,622 | 55
Mike Thelwall | 13,840 | 61
Derek J. de Solla Price | 13,263 | 33
Francis Narin | 11,297 | 45
Wolfgang Glänzel | 10,796 | 54
Ronald Rousseau | 9,570 | 42
Chaomei Chen | 9,512 | 43
Anthony (Ton) F.J. van Raan | 9,200 | 53
Ben R. Martin | 8,975 | 39
András Schubert | 8,655 | 45
Peter Ingwersen | 8,356 | 35
Henk F. Moed | 8,256 | 46
Blaise Cronin | 7,347 | 43
Henry Small | 7,307 | 32
Tibor Braun | 7,231 | 41
Vasily V. Nalimov | 6,343 | 31
Lutz Bornmann | 6,108 | 40
Belver C. Griffith | 5,695 | 26
Howard D. White | 5,569 | 30
Johan Bollen | 5,394 | 33
Katy Borner | 5,326 | 31
Félix de Moya Anegón | 5,074 | 35
Koenraad Debackere | 4,933 | 32
Jose Maria López Piñero | 4,823 | 31
Bibliometrics received a decisive boost from the personality and the work of
both Price and Garfield, who can be considered the fathers of this discipline. On
the one hand, Price, armed with the theoretical foundations laid by John
Desmond Bernal and Robert K. Merton, set out to systematically apply
quantitative techniques to the History and social studies of Science, developing
the theoretical foundations of Scientometrics, born from the combination of the
Sociology of science, History, Philosophy of science, and Information science.
This approach is characterized by the analysis of the life and activity of Science
and scientists from a quantitative perspective. Numbers were used to characterize the production of knowledge and scientists' lives: what they create and produce, whom they relate to, the sources they use, and the impact and influence they exert on and receive from other scientists.
On the other hand, Garfield made it possible for Bibliometrics to become a reality (McCain, 2010; Bensman, 2007): the creation of the "citation index" enabled the quantification of scientific activity through its main output: publications and the citations they generate. Since then, citation analysis and all its variants have become the most widespread analysis technique of this new specialty (as evidenced by the significant presence of highly cited documents that deal with this topic). Garfield defined the phenotype of the discipline: technology (the basis for the storage and circulation of information) is at the heart of all its tools. That is, Bibliometrics will evolve at the same rate as information and communication technologies do.
The map of Bibliometrics can also be discerned by analyzing the rest of the
authors in the list: the Hungarian school (and, more broadly, Eastern Europe and Russia, with Nalimov), the Dutch school (with its various branches in Leiden and
Amsterdam), the Belgian school (with Egghe and Rousseau), the North
American School (Small, Griffith, and White), the Spanish school (with López
Piñero, Spanish translator of Price‘s work, and the one who introduced
Bibliometrics in Spain), and the new authors that represent the technological
transformation of the discipline (mainly Thelwall).
b) Documents
An analysis of the list of the 25 most cited documents according to Google
Scholar (Table 3) reveals several issues:
- The importance of the documents that first introduce new techniques and citation-based indicators, like the ones by Hirsch (3rd), Garfield (9th and 10th), Small (12th), Egghe (23rd), and Griffith and White (37th). Among them we find the most widely known indicator in Bibliometrics (the impact factor) and the one that has come to replace it while extending its capabilities (the h-index).
- The prominence of both the work in which Hirsch proposes the h-index and the articles about the impact factor highlights the strong orientation of Bibliometrics towards evaluation in general, and towards the assessment of the performance of individuals, journals, and institutions in particular. This reveals a clear link between Bibliometrics and science policy, and explains the use of bibliometric indicators and other bibliometric tools by policymakers.
- As we would expect, among the most cited documents we find texts that
have served as textbooks for the discipline (written by Moed, Van Raan, Egghe, Rousseau, etc.).
- The anomalous institutionalization process of the discipline. The main "bibliometric laws" which still hold true today were established at the dawn of the discipline, even before it was fully instituted (Lotka, Zipf, Bradford), and were developed by authors working outside the discipline. The same happened with the proposal of the h-index by Hirsch, elaborated by this physicist in his "leisure time". Bibliometrics is often revolutionized from outside Bibliometrics.
- The great relevance of some topics, such as the "Triple Helix" by Leydesdorff, or the social networks studied by Barabási, which have made a big impact outside the borders of our discipline (Management and Economics in the first case, and sociometrics and computer science in the second).
Table 3. Top 25 most influential documents in Bibliometrics according to Google Scholar Citations
TITLE | AUTHORS | SOURCE | YEAR | GS CITATIONS
Little science, big science | de Solla Price | Columbia University Press | 1963 | 5,410
An index to quantify an individual's scientific research output | Hirsch | PNAS | 2005 | 4,860
The dynamics of innovation: from National Systems and "Mode 2" to a Triple Helix of university-industry-government relations | Etzkowitz & Leydesdorff | Research Policy | 2000 | 4,414
Universities and the global knowledge economy: a triple helix of university-industry-government relations | Etzkowitz & Leydesdorff | Pinter Press | 1997 | 2,585
Handbook of Quantitative Science and Technology Research: The Use of Publication and Patent Statistics in Studies of S&T Systems | Moed; Glänzel & Schmoch (eds.) | Springer | 2005 | 2,261
Citation analysis as a tool in journal evaluation: Journals can be ranked by frequency and impact of citations for science policy studies | Garfield | Science | 1972 | 2,166
Citation indexing: Its theory and application in science, technology, and humanities | Garfield | Wiley | 1979 | 2,130
The frequency distribution of scientific productivity | Lotka | Journal of the Washington Academy of Sciences | 1926 | 2,090
Cocitation in the scientific literature: A new measure of the relationship between two documents | Small | JASIS | 1973 | 1,988
Links and impacts: The influence of public research on industrial R&D | Cohen; Nelson & Walsh | Management Science | 2002 | 1,881
Evolution of the social network of scientific collaborations | Barabási; Jeong; Néda; Ravasz; Schubert & Vicsek | Physica A | 2002 | 1,851
Citation indexes for science: A new dimension in documentation through association of ideas | Garfield | Science | 1955 | 1,783
What is research collaboration? | Katz & Martin | Research Policy | 1997 | 1,591
Handbook of quantitative studies of science and technology | Van Raan (ed.) | North-Holland | 1988 | 1,510
The history and meaning of the journal impact factor | Garfield | JAMA | 2006 | 1,487
The increasing linkage between US technology and public science | Narin; Hamilton & Olivastro | Research Policy | 1997 | 1,211
A general theory of bibliometric and other cumulative advantage processes | de Solla Price | JASIST | 1976 | 1,148
Statistical bibliography or bibliometrics? | Pritchard | Journal of Documentation | 1969 | 1,134
Theory and practise of the g-index | Egghe | Scientometrics | 2006 | 1,113
The Web of knowledge: a Festschrift in honor of Eugene Garfield | Cronin & Atkins (eds.) | Information Today | 2000 | 1,102
Visualizing a discipline: An author co-citation analysis of information science, 1972-1995 | White & McCain | JASIS | 1998 | 1,100
CiteSpace II: Detecting and visualizing emerging trends and transient patterns in scientific literature | Chen | JASIST | 2006 | 1,083
Citation analysis in research evaluation | Moed | Springer | 2005 | 1,060
Citation frequency and the value of patented inventions | Harhoff; Narin; Scherer & Vopel | Review of Economics and Statistics | 1999 | 1,023
Maps of random walks on complex networks reveal community structure | Rosvall & Bergstrom | PNAS | 2008 | 992
If we pay attention to the distribution of documents according to their typology
(Figure 3), the journal article stands out overwhelmingly (89% of all 1,069
documents processed), showing that formal papers published in peer-reviewed journals stand as the main scientific vehicle in this social science discipline.
Figure 3. Distribution of highly cited documents in Bibliometrics according to Google Scholar
citations (n= 1,069)
The presence of books (5%) is smaller, but this figure may be misleading. An
analysis of the top highly cited documents according to Google Scholar citations
shows that within the top 25 documents, 8 of them are books (14 within the top
50). Obviously, the number of published books is lower than the number of
articles. The presence of documents from the remaining categories is lower:
book chapters (3%), and other materials (including doctoral theses, reports, etc.; 2%). Lastly, the results obtained for conference proceedings (1%) reveal a
low impact of this scientific communication channel.
c) Journals
The third unit analyzed is the journals in which highly cited documents have
been published (i.e., considering only the top 1,000 most cited documents). In
Table 4 we provide the top 25 journals according to the number of highly cited
documents published. Additionally, we show the total number of citations
received by these articles, the average number of citations per article (C/A), the percentage of highly cited articles in the sample (HCA), and the distribution of citations (%).
Scientometrics is the journal with the most articles among the 1,000 most cited documents (284 articles). It is thus the most influential journal in the discipline. Its birth in 1978 was a milestone in the process of institutionalization of the discipline.
The second place is occupied by JASIST (137 articles). This fact shows the
important role of this journal in Bibliometrics, although its scope is broader. This
journal has maintained since its inception a strong link between Information
Science and Bibliometrics, though some authors have noticed a slight
specialization towards Bibliometrics over time (Nicolaisen & Frandsen, 2015).
Journal of Informetrics, focused exclusively on Bibliometrics, Scientometrics, Webometrics, and Altmetrics, appears in the fourth position (36 articles). The young age of this journal (it was created in 2007) explains why there isn't a greater number of its articles among the most cited documents in the discipline.
The connection between Library and Information Science and Bibliometrics is
noticeable through the presence of other important LIS journals in the list, such
as Journal of Documentation, Journal of Information Science, Library Trends, or
Aslib Proceedings. This connection has been a matter of public record for a long time now (White & McCain, 1998; Larivière, Sugimoto & Cronin, 2012; Larivière, 2012).
The discipline's connections with the field of web technologies, approached from an information science perspective, are strongly marked as well (Cybermetrics, Online Information Review). Additionally, we can see that journals oriented towards the Social
Studies of Science (such as Research Policy, Social Studies of Science, and
Science and Public Policy) also have strong ties to Bibliometrics.
Lastly, the role of multidisciplinary journals (such as Nature, Science, PNAS, or PLoS One) should not be forgotten. If we analyze the number of citations instead of the number of articles published, we find the same three journals occupying the first positions (Scientometrics, JASIST, and Research Policy), but the data also show a great impact of articles published outside the core journals of the discipline. Science gets 9,219 citations from only 8 articles, PNAS achieves 7,642 citations from 9 articles, and PLoS One gets 2,376 citations from 13 articles (the figures for Nature are lower, with "only" 1,871 citations from 10 articles).
Table 4. Top 25 most influential journals in Bibliometrics according to Google Scholar Citations
JOURNAL | ARTICLES | CITATIONS | C/A | HCA (%) | CITATIONS (%)
Scientometrics | 284 | 44,384 | 156 | 29.8 | 22.5
JASIST | 137 | 27,021 | 197 | 14.4 | 13.7
Research Policy | 57 | 18,866 | 330 | 6.0 | 9.6
Journal of Informetrics | 36 | 5,052 | 140 | 3.8 | 2.6
Journal of Documentation | 25 | 5,538 | 221 | 2.6 | 2.8
Information Processing & Management | 24 | 4,404 | 183 | 2.5 | 2.2
Journal of Information Science | 20 | 3,815 | 190 | 2.1 | 1.9
Research Evaluation | 18 | 2,126 | 118 | 1.9 | 1.1
ARIST | 14 | 3,621 | 258 | 1.5 | 1.8
Social Studies of Science | 13 | 3,204 | 246 | 1.4 | 1.6
Science and Public Policy | 13 | 2,875 | 221 | 1.4 | 1.5
PLoS One | 13 | 2,376 | 182 | 1.4 | 1.2
Nature | 10 | 1,871 | 187 | 1.0 | 1.0
Current Contents | 10 | 1,696 | 169 | 1.0 | 0.9
PNAS | 9 | 7,642 | 849 | 0.9 | 3.9
Science | 8 | 9,219 | 1,152 | 0.8 | 4.7
Library Trends | 7 | 1,230 | 175 | 0.7 | 0.6
Medicina Clínica | 6 | 958 | 159 | 0.6 | 0.5
Online Information Review | 6 | 806 | 134 | 0.6 | 0.4
Science Technology & Human Values | 5 | 946 | 189 | 0.5 | 0.5
Aslib Proceedings | 5 | 765 | 153 | 0.5 | 0.4
Cybermetrics | 5 | 627 | 125 | 0.5 | 0.3
American Psychologist | 4 | 1,026 | 256 | 0.4 | 0.5
World Patent Information | 4 | 726 | 181 | 0.4 | 0.4
Ethics in Science and Environmental Politics | 4 | 687 | 171 | 0.4 | 0.3
d) Book publishers
The last unit of analysis is the book publishers. Table 5 shows the top 20 publishers according to the percentage of highly cited documents (top 1,000). Additionally, the number of documents, the citations (in total, and as a percentage of all citations), and the citations per document are offered.
The first position is occupied by Springer, with 10 documents within the set of highly cited books, receiving 5,766 citations (14.3% of all citations to book publishers). Information Today (10.9%) and Wiley (9.1%) occupy the second and third positions, respectively.
Table 5. Top 20 most influential publishers in Bibliometrics according to Google Scholar Citations

PUBLISHER | HC | HC (%) | CITATIONS | CITATIONS (%) | C/A
Springer | 10 | 18.2 | 5,766 | 14.3 | 576.60
Information Today | 6 | 10.9 | 1,635 | 4.0 | 272.50
Wiley | 5 | 9.1 | 3,121 | 7.7 | 624.20
Lexington | 4 | 7.3 | 1,627 | 4.0 | 406.75
Sage | 4 | 7.3 | 1,324 | 3.3 | 331.00
UFMG | 4 | 7.3 | 845 | 2.1 | 211.25
University of Chicago Press | 3 | 5.5 | 6,874 | 17.0 | 2,291.33
Russell Sage Foundation | 3 | 5.5 | 3,836 | 9.5 | 1,278.67
North-Holland | 3 | 5.5 | 2,130 | 5.3 | 710.00
Blackwell | 2 | 3.6 | 1,132 | 2.8 | 566.00
Elsevier | 2 | 3.6 | 1,071 | 2.7 | 535.50
Taylor Graham | 2 | 3.6 | 688 | 1.7 | 344.00
Scarecrow Press | 2 | 3.6 | 416 | 1.0 | 208.00
ISSI | 2 | 3.6 | 276 | 0.7 | 138.00
Ablex | 2 | 3.6 | 193 | 0.5 | 96.50
FECYT | 2 | 3.6 | 193 | 0.5 | 96.50
Columbia University Press | 1 | 1.8 | 5,410 | 13.4 | 5,410.00
Pinter Press | 1 | 1.8 | 2,585 | 6.4 | 2,585.00
Yale University Press | 1 | 1.8 | 936 | 2.3 | 936.00
MIT Press | 1 | 1.8 | 710 | 1.8 | 710.00
We can observe that all publishers achieve high numbers of citations per
document. In this case, we should highlight the performance of university
presses (such as University of Chicago, Columbia, Yale, or MIT), with a very
low presence in terms of productivity but an impressive impact in the number of
citations. The ability to attract well-established authors in order to edit
specialized books makes a great difference in book publisher rankings.
3.2. Online presence of the bibliometric community
Scientists traditionally communicated with their communities both through
informal means (letters, meetings, seminars, conferences, etc.) and formal means (books, journal articles, patents, etc.), and in both cases the scope of these communications was limited by the printed technology in which the contents were transmitted. Today, since the birth of the Web, which brought the
chance to create personal pages, and with the emergence of academic social
networks, researchers can display their work through a rich variety of channels
and electronic formats.
Studies of the level of web presence and impact of scientists through their personal websites have already been carried out. Barjak, Li & Thelwall (2007) analyzed data from 456 scientists from five scientific disciplines in six European countries, whereas Mas-Bleda & Aguillo (2013) and Mas-Bleda et al. (2014) put their focus on 1,498 highly cited researchers working at European institutions, distributed across 22 different countries, using data extracted from the ISIHighlyCited.com database.
In the field of Bibliometrics, the pioneering work by Haustein et al. (2014) should also be highlighted. In this study, 1,136 documents authored by the 57 presenters at the 2010 STI conference in Leiden were collected using WoS and Scopus. After this, the scholarly and professional social media presence of these authors was measured across several platforms (Google Scholar Citations, LinkedIn, Twitter, Academia.edu, ResearchGate, and ORCID).
In this work we intend to expand this sample by considering the social presence
of the whole bibliometric community as well as other researchers who are
related to the discipline in some way. A total of 814 researchers (398
bibliometricians and 416 researchers who have sporadically published
bibliometric studies) have been analyzed.
In Table 6 we find the distribution of authors according to the number of
platforms in which they have created a personal profile, regardless of their
impact or the degree to which these profiles are updated. We highlight the
following points:
- The degree of social presence is high. All 814 authors have at least a
personal profile created in one platform; 14.7% of the authors are visible
in only one platform.
- Authors with two (19.1%), three (23.5%), or four (21.1%) profiles are the most numerous groups.
- No significant differences between core and related authors are found.
- There is a small group of authors (6.2%) with high media visibility (presence in all the social media analyzed), among them some of the most influential bibliometricians (such as Loet Leydesdorff, Mike Thelwall, Chaomei Chen, Lutz Bornmann, Félix de Moya Anegón, Katy Borner, Judit Bar-Ilan, Nees Jan van Eck, and Isidro F. Aguillo).
Table 6. Social presence of the bibliometric community
NUMBER OF PLATFORMS | CORE | RELATED | TOTAL
6 | 32 | 19 | 51
5 | 72 | 51 | 123
4 | 76 | 96 | 172
3 | 80 | 112 | 192
2 | 78 | 78 | 156
1 | 60 | 60 | 120
TOTAL | 398 | 416 | 814
The use of each specific social platform is shown in Table 7. The main results
derived from these data are the following:
- ResearchGate is (after Google Scholar Citations) the second most used
platform by these authors (66.7%), followed at some distance by
Mendeley (41.28%) and homepages (41.15%).
- The number of Mendeley profiles is high, although this figure by itself is misleading, since 17.1% of the profiles (68 out of 397) are basically empty. ResearcherID is also affected by this issue (34.45% of the profiles are empty), as is Twitter (47% of the 240 authors with a Twitter profile have published fewer than 100 tweets).
- ResearcherID presents wider acceptance among core authors (45.7%) than among related authors (35.1%).
- Twitter is the least used platform, since only 33.17% of core authors (and
25.96% of related authors) have created a Twitter profile.
- Personal homepages are widely used by authors, although this
denomination covers a wide range of different website typologies
(personal websites outside institutions, institutional websites not
managed by authors). The use of social platforms as personal sites is
common (22 authors considered their profiles in other academic social
sites such as ResearchGate, Academia.edu, Mendeley, and ImpactStory
as their personal websites).
- Core and related authors present similar behavior as regards their
presence on these social platforms, although there is a slightly higher
rate of core authors on Twitter, ResearcherID, and Mendeley than there
is of related authors.
Table 7. Degree of use of social platforms by type of author

    WEB PLATFORMS                 CORE      %      RELATED      %      TOTAL      %
    * Google Scholar Citations     398   100.00       416    100.00     814   100.00
    ResearchGate                   260    65.33       283     68.03     543    66.71
    Mendeley                       171    42.96       165     39.66     336    41.28
    ** Homepage                    158    39.69       177     42.54     335    41.15
    ResearcherID                   182    45.73       146     35.10     328    40.29
    Twitter                        132    33.17       108     25.96     240    29.48

* All authors in the sample have a profile in GSC. ** ResearchGate and Academia.edu URLs were discarded.
Figure 4 shows the combinations of profiles used by the authors (core and
related) of the bibliometric community. It should be remembered that all authors
in our sample have a Google Scholar Citations profile (this was the main
selection criterion).
Personal webpages have been omitted from this analysis since they represent
another dimension of web presence, different from those offered by social
platforms and academic profiles.
Figure 4. Combination of profiles used by the bibliometricians in our sample
As we can see in Figure 4, there is a great number of researchers who only have
a profile in Google Scholar Citations (159). There are also many authors who
only have a profile in GSC and ResearchGate (142). The number of
researchers who have an account in all the platforms analyzed in this study
(GSC, ResearcherID, Mendeley, Twitter, and ResearchGate) is 93 (11.4% of
our sample).
The remaining combinations seem to be more unusual. For example, there are
only 12 authors who use only GSC and Twitter, and 14 authors who use only
GSC, Mendeley, and Twitter. In a similar manner, there are only 11 authors who
use only GSC, ResearcherID, and Mendeley.
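For reference, combination counts like these can be derived mechanically from a
simple membership matrix with one boolean column per platform. Below is a
minimal sketch in Python; the three sample rows and the column names are
invented for illustration, not our actual dataset:

    import pandas as pd

    # One row per author in the sample; True means the author has a profile
    # on that platform. GSC is omitted because, by construction of the
    # sample, every author has a Google Scholar Citations profile.
    profiles = pd.DataFrame([
        {"ResearcherID": False, "ResearchGate": True,  "Mendeley": False, "Twitter": False},
        {"ResearcherID": True,  "ResearchGate": True,  "Mendeley": True,  "Twitter": True},
        {"ResearcherID": False, "ResearchGate": False, "Mendeley": False, "Twitter": False},
    ])

    # Each unique combination of True/False values is one profile combination;
    # counting the unique rows gives the counts plotted in Figure 4.
    print(profiles.value_counts())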
These results are similar to the ones offered by Van Noorden (2014) about the
presence of scientists in social networks and the provisional results by Bosman
and Kramer (2015).
Despite the fact that this sample is biased in favor of Google Scholar
Citations because of how the data were collected, there is no doubt that GSC is
currently the platform authors prefer to display their publications, followed at a
distance (though one that keeps shrinking) by ResearchGate: 66.7% of the
authors with a GSC profile also have a ResearchGate profile. This is
significant enough, although these results must be tempered by the degree of
use and update frequency of each platform, aspects which will be discussed
later in greater detail.
These results should be especially contextualized within the bibliometric
community, which undoubtedly has a certain bias towards using these
platforms, because these platforms are sometimes objects of study themselves.
Differences in the degree of presence on social platforms in different fields of
knowledge should be expected, as González-Díaz, Iglesias-García, and Codina
(2015) have recently shown in their analysis of the discipline of Communication.
3.3. Comparing social platform metrics: from citations to followers
After analyzing the academic output and impact of the bibliometric community
using Google Scholar Citations, and describing the preferences of the members
of this scientific community for social interaction on the Web, in this section we
analyze the correlations between these metrics: first, all metrics associated
with Google Scholar Citations profiles; and second, all metrics offered by each
of the other social platforms analyzed (Mendeley, ResearcherID, ResearchGate,
and Twitter). Personal webpages have been excluded from this analysis.
By way of illustration, in Table 8 we show the median of the main metrics
evaluated so that we can compare the performance or size of similar indicators
in each web platform. In this sense, we highlight the following issues:
- Regarding "Total citations received", the highest median value
corresponds to Google Scholar (156), followed by ResearchGate (85)
and ResearcherID (63).
- As to the h-index, Google Scholar obtains a median of 6, whereas in
ResearcherID this value is lower (4); a minimal computation sketch is
given after this list.
- Regarding academic output, ResearchGate achieves the first position
(27), followed at a distance by ResearcherID (15) and Mendeley (9). The
number of records stored in each Google Scholar Citations profile was not
collected for this work.
- Regarding the social interaction features ("following" / "followed by"), users
in both ResearchGate and Mendeley show somewhat passive behavior:
they tend to be followed by many people, but they do not follow many
other users. Interestingly, the opposite behavior is found on Twitter,
where scholars tend to follow many users, but it seems harder to be
followed by others. Since ResearchGate and Mendeley deal exclusively
with academic audiences, a logical explanation may be that respected
scholars who create an account are widely followed, but do not tend
to follow other users. In the open space defined by Twitter, the
situation is just the opposite: gaining followers requires active
participation in the platform.
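For reference, the h-index values compared here follow the standard definition:
the largest h such that h of an author's publications have received at least h
citations each. A minimal, self-contained sketch of this computation (not the
code used by any of the platforms):

    def h_index(citations):
        """Largest h such that at least h publications have >= h citations."""
        h = 0
        for rank, c in enumerate(sorted(citations, reverse=True), start=1):
            if c >= rank:
                h = rank
            else:
                break
        return h

    # Five papers with 10, 5, 3, 2 and 1 citations yield an h-index of 3.
    print(h_index([10, 5, 3, 2, 1]))  # -> 3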
Table 8. Median of principal metrics

    SOURCE                     METRIC               MEDIAN
    Google Scholar (n=811)     Citations_total         156
                               Citations_last5         117
                               H-index_total             6
                               H-index_last5             5
                               i10_total                 4
                               i10_last5                 3
    ResearcherID (n=275)       Total_articles           15
                               Articles_cited           11
                               Times_cited              63
                               Average_citations      5.75
                               H-index                   4
    ResearchGate (n=515)       RG Score              13.82
                               Publications             27
                               Impact_points         12.97
                               Followers                38
                               Following                23
                               Downloads               802
                               Views                  1845
                               Citations                85
                               Profile_views           696
    Mendeley (n=185)           Publications              9
                               Readers                  93
                               Followers                 3
                               Following                 2
    Twitter (n=226)            Tweets                153.5
                               Followers                99
                               Following               130
In Table 9 we show the correlations among each of the 31 metrics considered
in this study (α = 0.05), whereas in Figure 5 we show the results of a
Principal Component Analysis (PCA).
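A minimal sketch of this kind of analysis, assuming the 31 indicators are stored
as the columns of a numeric matrix (the random placeholder data below stand in
for our author-by-metric table):

    import numpy as np
    from scipy.stats import spearmanr
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    metrics = rng.random((814, 31))   # placeholder: one row per author, one column per metric

    # Spearman rank correlation between every pair of columns (alpha = 0.05).
    rho, pval = spearmanr(metrics)    # both are 31 x 31 matrices
    significant = pval < 0.05

    # PCA on the standardized metrics; the loadings of the first two
    # components give each metric's position in a map like Figure 5.
    pca = PCA(n_components=2).fit(StandardScaler().fit_transform(metrics))
    loadings = pca.components_.T      # 31 x 2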
Figure 5. Principal Component Analysis for 31 metrics associated with bibliometricians’ social
platform profiles
The main results are:
- We find two clear dimensions: at the top we can see all metrics related to
connectivity and popularity (followers), and at the bottom, all metrics
related to academic performance. This second group can further be
divided into usage metrics (views and downloads) and citation metrics.
ResearchGate provides examples for these two faces of academic
performance, since Google Scholar Citations profiles do not offer data
about downloads or reads.
- All metrics provided by Google Scholar (both citations and h-index)
correlate strongly among themselves.
- We find a clear separation between the usage metrics (views and downloads)
and the citation metrics (Citations, Impact Points) provided by ResearchGate.
The RG Score, for example, displays a high correlation with metrics from
Google Scholar Citations: total citations (r = 0.89) and the h-index (r = 0.92).
- The number of readers in Mendeley is connected to the usage metrics
offered by ResearchGate, and strongly correlates with Google's total
citations (r = 0.77), Google's h-index (r = 0.82), and the RG Score (r = 0.75).
The number of documents in Mendeley is far from the Mendeley readers
in this PCA, probably because Mendeley profiles aren't updated as
regularly as GSC profiles. Of course, this also affects the combined metric
readers per document.
- Indicators from ResearcherID strongly correlate among themselves, but
are slightly separated from the other citation metrics (those from Google
Scholar and ResearchGate). This can probably be explained by the low
regularity with which ResearcherID profiles are updated. In view of the
results, this isolation might even be used as a mechanism to check the
currency (or lack thereof) of a ResearcherID profile.
- All metrics associated with the number of followers (all Twitter metrics and
their counterparts in ResearchGate and Mendeley) correlate among
themselves, and are separated from the citation metrics. Curiously
enough, the number of followers offered by ResearchGate is, within the
group of connectivity metrics, the one which is closest to the usage
metrics, serving in fact as a bridge between the two groups. This may
mean that networking metrics from academic social networks correlate
better with usage metrics than networking metrics from Twitter.
Mendeley's networking metrics, however, are placed closer to Twitter's
metrics.
- The impact of tweets (measured by retweets) is closer to the academic
side. In any case, their correlation with impact measures is statistically
significant (α = 0.05): the correlations of Sum Retweets and H-Retweets
with Google Scholar total citations are 0.44 and 0.45, respectively.
- The number of days a Twitter account has been active does not seem
to correlate with any other Twitter metric. Unlike in online marketing, time
is not a critical factor in gaining followers. Academic prestige and activity
(number of tweets posted) may be the most important parameters for
achieving a large number of Twitter followers.
Table 9. Correlation analysis (Spearman) for 31 metrics associated with bibliometricians' social platform profiles
(Metrics 1-6: Google Scholar; 7-11: ResearcherID; 12-20: ResearchGate; 21-25: Mendeley; 26-31: Twitter)

     1: 1.00 0.99 0.97 0.96 0.97 0.97 0.57 0.61 0.67 0.62 0.66 0.89 0.86 0.86 0.05 0.43 0.78 0.87 0.95 0.60 0.57 0.77 0.57 0.11 0.02 0.17 0.21 -0.06 0.08 0.45 0.46
     2: 0.99 1.00 0.97 0.97 0.96 0.97 0.56 0.62 0.67 0.63 0.67 0.91 0.88 0.88 0.06 0.46 0.81 0.90 0.94 0.63 0.58 0.79 0.61 0.13 0.05 0.18 0.21 -0.04 0.07 0.44 0.46
     3: 0.97 0.97 1.00 0.99 0.97 0.98 0.60 0.64 0.67 0.60 0.68 0.92 0.91 0.86 0.10 0.50 0.84 0.91 0.92 0.66 0.62 0.82 0.61 0.12 0.02 0.18 0.22 -0.05 0.06 0.44 0.46
     4: 0.96 0.97 0.99 1.00 0.97 0.98 0.57 0.63 0.66 0.59 0.67 0.93 0.90 0.87 0.07 0.50 0.84 0.91 0.92 0.66 0.59 0.81 0.63 0.12 0.03 0.17 0.22 -0.05 0.07 0.43 0.45
     5: 0.97 0.96 0.97 0.97 1.00 0.99 0.58 0.63 0.65 0.57 0.66 0.88 0.88 0.86 0.06 0.46 0.80 0.88 0.93 0.61 0.59 0.80 0.60 0.10 0.01 0.17 0.21 -0.08 0.09 0.42 0.43
     6: 0.97 0.97 0.98 0.98 0.99 1.00 0.56 0.62 0.65 0.58 0.66 0.90 0.87 0.87 0.05 0.47 0.81 0.88 0.94 0.62 0.59 0.80 0.62 0.10 0.01 0.17 0.21 -0.08 0.08 0.42 0.44
     7: 0.57 0.56 0.60 0.57 0.58 0.56 1.00 0.91 0.88 0.78 0.89 0.59 0.59 0.61 0.03 0.22 0.48 0.59 0.57 0.44 0.54 0.58 0.37 0.06 0.01 -0.04 0.02 -0.11 0.08 0.21 0.25
     8: 0.61 0.62 0.64 0.63 0.63 0.62 0.91 1.00 0.96 0.85 0.97 0.67 0.65 0.71 -0.02 0.23 0.55 0.65 0.62 0.45 0.53 0.59 0.40 0.01 -0.04 -0.08 -0.02 -0.18 0.11 0.14 0.20
     9: 0.67 0.67 0.67 0.66 0.65 0.65 0.88 0.96 1.00 0.95 0.99 0.69 0.63 0.73 -0.03 0.20 0.56 0.67 0.69 0.48 0.52 0.62 0.45 0.01 -0.05 -0.06 0.00 -0.15 0.08 0.17 0.22
    10: 0.62 0.63 0.60 0.59 0.57 0.58 0.78 0.85 0.95 1.00 0.93 0.60 0.54 0.65 -0.04 0.14 0.50 0.61 0.65 0.46 0.50 0.58 0.41 0.03 -0.01 -0.03 0.02 -0.10 0.09 0.21 0.25
    11: 0.66 0.67 0.68 0.67 0.66 0.66 0.89 0.97 0.99 0.93 1.00 0.70 0.65 0.73 -0.02 0.22 0.57 0.68 0.68 0.50 0.52 0.62 0.46 0.01 -0.05 -0.07 0.00 -0.15 0.09 0.16 0.21
    12: 0.89 0.91 0.92 0.93 0.88 0.90 0.59 0.67 0.69 0.60 0.70 1.00 0.87 0.89 0.15 0.51 0.83 0.91 0.90 0.69 0.52 0.75 0.62 0.11 0.02 0.12 0.20 -0.02 0.01 0.37 0.39
    13: 0.86 0.88 0.91 0.90 0.88 0.87 0.59 0.65 0.63 0.54 0.65 0.87 1.00 0.78 0.26 0.63 0.89 0.94 0.83 0.70 0.67 0.77 0.43 0.19 0.12 0.18 0.20 -0.04 0.10 0.38 0.40
    14: 0.86 0.88 0.86 0.87 0.86 0.87 0.61 0.71 0.73 0.65 0.73 0.89 0.78 1.00 -0.04 0.32 0.68 0.79 0.89 0.48 0.45 0.69 0.59 0.02 -0.07 0.01 0.09 -0.15 0.05 0.34 0.37
    15: 0.05 0.06 0.10 0.07 0.06 0.05 0.03 -0.02 -0.03 -0.04 -0.02 0.15 0.26 -0.04 1.00 0.70 0.34 0.26 0.06 0.42 0.30 0.09 -0.24 0.17 0.14 0.16 0.13 0.25 -0.12 0.09 0.11
    16: 0.43 0.46 0.50 0.50 0.46 0.47 0.22 0.23 0.20 0.14 0.22 0.51 0.63 0.32 0.70 1.00 0.69 0.63 0.42 0.71 0.56 0.49 0.16 0.29 0.20 0.21 0.23 0.08 -0.03 0.24 0.29
    17: 0.78 0.81 0.84 0.84 0.80 0.81 0.48 0.55 0.56 0.50 0.57 0.83 0.89 0.68 0.34 0.69 1.00 0.95 0.75 0.82 0.64 0.74 0.44 0.25 0.15 0.16 0.20 -0.01 0.02 0.32 0.34
    18: 0.87 0.90 0.91 0.91 0.88 0.88 0.59 0.65 0.67 0.61 0.68 0.91 0.94 0.79 0.26 0.63 0.95 1.00 0.86 0.80 0.65 0.78 0.49 0.24 0.16 0.18 0.23 0.00 0.10 0.40 0.42
    19: 0.95 0.94 0.92 0.92 0.93 0.94 0.57 0.62 0.69 0.65 0.68 0.90 0.83 0.89 0.06 0.42 0.75 0.86 1.00 0.58 0.53 0.78 0.61 0.07 -0.02 0.07 0.13 -0.12 0.06 0.35 0.36
    20: 0.60 0.63 0.66 0.66 0.61 0.62 0.44 0.45 0.48 0.46 0.50 0.69 0.70 0.48 0.42 0.71 0.82 0.80 0.58 1.00 0.54 0.61 0.38 0.22 0.13 0.18 0.23 0.06 0.09 0.28 0.32
    21: 0.57 0.58 0.62 0.59 0.59 0.59 0.54 0.53 0.52 0.50 0.52 0.52 0.67 0.45 0.30 0.56 0.64 0.65 0.53 0.54 1.00 0.83 0.27 0.43 0.36 0.24 0.21 0.12 0.06 0.35 0.39
    22: 0.77 0.79 0.82 0.81 0.80 0.80 0.58 0.59 0.62 0.58 0.62 0.75 0.77 0.69 0.09 0.49 0.74 0.78 0.78 0.61 0.83 1.00 0.72 0.26 0.17 0.17 0.19 0.00 0.00 0.35 0.38
    23: 0.57 0.61 0.61 0.63 0.60 0.62 0.37 0.40 0.45 0.41 0.46 0.62 0.43 0.59 -0.24 0.16 0.44 0.49 0.61 0.38 0.27 0.72 1.00 -0.10 -0.17 -0.05 0.04 -0.15 -0.06 0.14 0.14
    24: 0.11 0.13 0.12 0.12 0.10 0.10 0.06 0.01 0.01 0.03 0.01 0.11 0.19 0.02 0.17 0.29 0.25 0.24 0.07 0.22 0.43 0.26 -0.10 1.00 0.96 0.46 0.43 0.42 0.24 0.42 0.43
    25: 0.02 0.05 0.02 0.03 0.01 0.01 0.01 -0.04 -0.05 -0.01 -0.05 0.02 0.12 -0.07 0.14 0.20 0.15 0.16 -0.02 0.13 0.36 0.17 -0.17 0.96 1.00 0.46 0.41 0.45 0.27 0.41 0.41
    26: 0.17 0.18 0.18 0.17 0.17 0.17 -0.04 -0.08 -0.06 -0.03 -0.07 0.12 0.18 0.01 0.16 0.21 0.16 0.18 0.07 0.18 0.24 0.17 -0.05 0.46 0.46 1.00 0.87 0.77 0.29 0.71 0.69
    27: 0.21 0.21 0.22 0.22 0.21 0.21 0.02 -0.02 0.00 0.02 0.00 0.20 0.20 0.09 0.13 0.23 0.20 0.23 0.13 0.23 0.21 0.19 0.04 0.43 0.41 0.87 1.00 0.81 0.40 0.78 0.77
    28: -0.06 -0.04 -0.05 -0.05 -0.08 -0.08 -0.11 -0.18 -0.15 -0.10 -0.15 -0.02 -0.04 -0.15 0.25 0.08 -0.01 0.00 -0.12 0.06 0.12 0.00 -0.15 0.42 0.45 0.77 0.81 1.00 0.18 0.55 0.53
    29: 0.08 0.07 0.06 0.07 0.09 0.08 0.08 0.11 0.08 0.09 0.09 0.01 0.10 0.05 -0.12 -0.03 0.02 0.10 0.06 0.09 0.06 0.00 -0.06 0.24 0.27 0.29 0.40 0.18 1.00 0.30 0.32
    30: 0.45 0.44 0.44 0.43 0.42 0.42 0.21 0.14 0.17 0.21 0.16 0.37 0.38 0.34 0.09 0.24 0.32 0.40 0.35 0.28 0.35 0.35 0.14 0.42 0.41 0.71 0.78 0.55 0.30 1.00 0.98
    31: 0.46 0.46 0.46 0.45 0.43 0.44 0.25 0.20 0.22 0.25 0.21 0.39 0.40 0.37 0.11 0.29 0.34 0.42 0.36 0.32 0.39 0.38 0.14 0.43 0.41 0.69 0.77 0.53 0.32 0.98 1.00

Metric codes:
    1  GS_citations_total        12  RG_score              23  MEND_readers/document
    2  GS_citations_last5        13  RG_publications       24  MEND_followers
    3  GS_hindex_total           14  RG_impact_points      25  MEND_following
    4  GS_hindex_last5           15  RG_following          26  TW_tweets
    5  GS_i10_total              16  RG_followers          27  TW_followers
    6  GS_i10_last5              17  RG_downloads          28  TW_following
    7  RID_n_total_articles      18  RG_views              29  TW_dias (days active)
    8  RID_n_articles_cit        19  RG_citations          30  TW_sum_retweets
    9  RID_sum_times_cited       20  RG_profile_views      31  TW_h_retweets
    10 RID_average_cit           21  MEND_pub
    11 RID_hindex                22  MEND_readers
3.4. Data reliability
After describing the multifaceted presence (authors, documents, and sources)
of the bibliometric community in Google Scholar Citations, describing the
presence of the authors of this community in other social platforms, and
analyzing the correlations between all the metrics offered by these platforms, it
is essential to address the reliability of these metrics and platforms. In science,
if the data source and the instrument (which stores the data and computes the
measures) are not reliable, the results achieved are meaningless and
scientifically irrelevant; such groundless results should not be considered
proper scientific results until their validity is proven.
In Bibliometrics, there is a long tradition of studies addressing the errors related
to the correct assignment of citations to documents in bibliometric databases,
as well as the deficiencies in the design or application of bibliometric indicators
(Sher, Garfield & Elias, 1966; Poyer, 1979; Garfield, 1983; Moed & Vriens,
1989; Garfield, 1990; Garcia-Perez, 2010; Franceschini, Maisano &
Mastrogiacomo, 2015).
Since these platforms are quite new, there are still few in-depth empirical
studies using representative samples which would allow us to make informed
assertions about their reliability. So far, there are only a few isolated analyses
pointing out errors, inaccuracies, and inconsistencies. Regrettably, there are
not many of these works, and they don't often go beyond reporting a few
anecdotal issues. In this respect, we must highlight the great impact of Peter
Jacsó's work, which analyzed the strengths and especially the weaknesses of
Google Scholar (Jacsó 2005; 2006a; 2006b; 2008; 2010).
In order to contextualize all the data offered previously in this work, we present
a final section providing insights into the different kinds of errors found in each
of the platforms, with a special emphasis on Google Scholar, since it has been
our main source of data.
3.4.1. The uncontrolled giant: Google Scholar & Google Scholar Citations
The errors that can compromise the metric portrait of an author offered by
Google Scholar can be grouped into two main categories: first, the errors
Google Scholar sometimes makes when it indexes a document or when it
assigns citations to it; and second, the specific errors that are sometimes made
during the creation of a Google Scholar Citations profile.
The former are a logical consequence of the tricky and complex task of
automatically finding the academic papers currently available on the web. This
task also involves merging into a single record all possible versions of the same
work, and linking to it all documents in which it is cited (keeping in mind that
these documents and references can be presented in the most varied formats).
The latter are the ultimate responsibility of the author, who must periodically
revise his/her profile in order to eliminate misattributed documents which might
have been included in the automatic weekly updates, clean the records by
merging different versions of the same document when Google Scholar's
algorithms are not able to detect their similarity, and improve and complete the
bibliographic references of these documents (filling in blank fields when
Google Scholar hasn't been able to find that information).
Next, we classify, describe, and illustrate some of the most common mistakes in
Google Scholar:
a) Incorrect identification of the title of the document
Google Scholar always tries to extract bibliographic information from the HTML
meta tags of a webpage. When no meta tags are available, it parses the
webpage itself (the HTML code of the page, or even the PDF files themselves).
Even though its spiders are able to successfully parse pages with a quite broad
range of different structures, and despite the fact that Google has published a
very clear set of inclusion guidelines, some parsing errors occasionally arise for
documents extracted from websites with unusual layouts. In these cases it is
not rare that an incorrect text string is selected as the title of the document. In
Figure 6 we illustrate an example in which an incorrect string
("www.redalyc.org") has been selected as the title of the document in several
records, probably because it is the string featured with the largest font size on
the first page of the PDF document from which Google Scholar parsed the
bibliographic information. Note that the authors and the source publications
are correctly assigned.
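As an aside, the bibliographic meta tags involved are easy to inspect. The
sketch below, using only Python's standard library, collects the common
"citation_*" meta tags that many journal websites embed; the HTML snippet is
invented, and this is merely an illustration of the kind of metadata a crawler
reads, not Google Scholar's actual parser:

    from html.parser import HTMLParser

    class CitationMetaParser(HTMLParser):
        """Collect <meta name="citation_..." content="..."> pairs from a page."""
        def __init__(self):
            super().__init__()
            self.fields = {}

        def handle_starttag(self, tag, attrs):
            if tag != "meta":
                return
            attrs = dict(attrs)
            name = attrs.get("name", "")
            if name.startswith("citation_"):
                self.fields.setdefault(name, []).append(attrs.get("content", ""))

    html = ('<meta name="citation_title" content="An example title">'
            '<meta name="citation_author" content="Doe, Jane">')
    parser = CitationMetaParser()
    parser.feed(html)
    print(parser.fields)  # {'citation_title': [...], 'citation_author': [...]}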
Figure 6. Document titles improperly identified in Google Scholar: URLs
On many other occasions, other text strings, such as the author's name and/or
the year of publication, are incorrectly selected as the title of the document. In
Figure 7 we can observe how "de Solla" has been selected as the title in many
records.
Figure 7. Author names incorrectly selected as document titles in Google Scholar
Source: https://scholar.google.com/scholar?start=0&q=allintitle:+%22de+solla%22+-Moravcsik+-
gulls+-comments+-1922+-foreword+-Toward+-tribute+-space+-pensamento+-address+-
appreciation&hl=en&as_sdt=0,5
b) Ghost authors
The topic of ghost authors, citations, and documents was approached by Jacsó
in numerous works, mostly before Google Scholar Citations was launched.
Although profiles have served to filter and correct many mistakes, some of them
still persist, especially if authors do not clean their personal profiles. In Figure 8
we can see one such example. In this case, the record only displays one person
as the author of the article (Carmen Martín Moreno), when in fact the article was
written by two authors (Elías Sanz-Casado and Carmen Martín Moreno).
In this case, Google Scholar extracted the bibliographic information from the
HTML meta tags on the website of the journal where the article was published,
but, as we can see in Figure 8 (bottom image), these metadata were already
incorrect (the title should read "Técnicas bibliométricas aplicadas a los estudios
de usuarios") and incomplete (Elías Sanz-Casado is missing from the record).
Nonetheless, thanks to Google Scholar Citations, Elías was able to add the
document to his profile, even if his name is still missing from the authors field
(Figure 8, top left).
Figure 8. Missing authors in primary versions of documents in Google Scholar
c) Book reviews indexed as books
Among the most common mistakes in document identification is mistaking the
review of a book for the book itself. In Figure 9 we show two different records
which correspond to book reviews of the work "Introduction to informetrics.
Quantitative methods in Library, Documentation and Information Science" by
Egghe and Rousseau. At first glance the first record (Figure 9, top) looks like
a normal record, since the title and authors of the book have been correctly
identified. However, the record actually points to a review of the book published
in Revista Española de Documentación Científica. The second record (Figure 9,
bottom) is also a review of the book, published in Aslib Proceedings. In this
case, the author of the review is the one who appears in the GS record
(Brookes).
Figure 9. Authorship and attribution of book reviews
d) Incorrect attribution of documents to authors
Somewhat related to the previous error is the attribution of a document to the
wrong authors. In Figure 10 we observe a special case: the book "Introduction
to informetrics. Quantitative methods in Library, Documentation and Information
Science" by Egghe and Rousseau is wrongly attributed to Tague-Sutcliffe,
probably because this author has a short publication in the journal Information
Processing & Management (Figure 10, bottom) with a similar title (An
introduction to informetrics).
Figure 10. Authorship improperly assigned in Google Scholar
e) Failing to merge all versions of the same document into one record
Although the algorithms for grouping versions work well in most cases, Google
Scholar sometimes fails to realize that two or more records it has indexed
actually represent the same document. This happens when there are enough
formal differences between the metadata of the two versions (differences in the
way the names of the authors have been stored, in the title, in the year of
publication...) that Google Scholar judges they are not similar enough to be the
same document. This issue mostly affects document types other than journal
articles (books, book chapters, reports), but duplicate articles also exist. Articles
translated into one or more languages are an extreme example: in those cases,
the title of the original version is completely different from that of the translated
version, so it is understandable that Google Scholar doesn't realize they are the
same document. From a bibliometric perspective, however, their citation counts
shouldn't be split.
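To make the underlying difficulty concrete, here is a toy version-matching check
based on normalized title similarity. The normalization and the 0.9 threshold are
our own simplification for illustration, not Google Scholar's actual algorithm:

    import re
    from difflib import SequenceMatcher

    def normalize(title: str) -> str:
        # Lowercase and strip punctuation before comparing.
        return re.sub(r"[^a-z0-9 ]+", "", title.lower()).strip()

    def same_document(title_a: str, title_b: str, threshold: float = 0.9) -> bool:
        ratio = SequenceMatcher(None, normalize(title_a), normalize(title_b)).ratio()
        return ratio >= threshold

    # Small formal differences still match...
    print(same_document("Measuring Science", "Measuring science."))         # True
    # ...but a translated title does not, so those versions stay unmerged.
    print(same_document("Measuring Science", "La medición de la ciencia"))  # False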
This issue obviously affects the citation counts of some documents. In Figure 11
we can observe how this phenomenon affects a book chapter: "Measuring
science", by Van Raan.
Figure 11. Versions of book chapters improperly tied in Google Scholar
f) Grouping different editions of the same book in a single record
Conversely to the previous error, Google Scholar sometimes groups together
records that should stay separate, for example when there are different
editions of the same book (a new book edition provides new content, contrary
to a reprinting, which is identical to the previous printing).
Figure 12 illustrates the case of Little Science, Big Science, written by Price.
This book was first published in 1963 by Columbia University Press, and
re-edited in 1986 under the title "Little science, big science... and beyond", an
edition that contained the original text of the book as well as seven of his most
famous articles.

Figure 12. Different book editions tied in Google Scholar
The primary version (which has received 4,130 citations) is the edition from
1986, but among its versions are several records pointing to the version from
1963. Different editions of the same book should be treated as separate
documents when computing citations because their content may be very
different.
Of course, automatically detecting and managing these details is a very complex
task, and only a very tiny fraction of the documents indexed in Google Scholar
(the most influential manuals and seminal works) would benefit from this
thorough treatment. We must not forget that Google Scholar is, first of all, a
search tool devoted to helping researchers find academic information. A great
percentage of users probably don't care about the different editions of a book,
and those who do probably just want the most recent one. That may be the
reason why Google Scholar usually displays the most recent edition of a book
as the primary version. The use of separate entries for different editions is
something only a few people, like librarians, would be interested in.
In any case, this may have an important effect on citation counts, because
citations to different editions (providing different content) are added together. In
Figure 13 we can see how the 1986 edition of the book is receiving citations
that were actually made to the original work published in 1963.
Figure 13. Citations to different book editions tied in Google Scholar
g) Improper attribution of citations to a document
Document citation counts in Google Scholar are also affected by the attribution
of "ghost" citations to documents, that is, citations that aren't actually there
when we examine the citing document. Figure 14 shows an example of this
issue: the work Le transfert de l'information scientifique et technique: le rôle
des nouvelles technologies de l'information face à la crise du modèle actuel de
communication écrite has allegedly received eight citations, but if we manually
examine the second document in the list (marked in red), we can't find any
mention of the cited work. This phenomenon has been frequently observed in
documents stored in the E-LIS repository (http://eprints.rclis.org).
Figure 14. Appearance of false citations
h) Duplicate citations
This phenomenon is a consequence of an issue previously discussed. When
Google Scholar fails to realize that two records are actually versions of the
same document, these versions are stored as if they were different documents.
Therefore, each of them provides its own set of citations to the citation pool.
Since the two sets of citations are probably identical, each cited document will
receive two citations from what is actually only one citing document, thus
falsely inflating its citation count.
In Figure 15 we observe a double example of this phenomenon. In the first case
(first red rectangle), there are three versions of the same document. Note the
differences in the way the authors' names are stored, since this is probably the
reason why the records weren't merged into one. In the second case (second
red rectangle), the two records refer to the same document (the first one is the
English version of the article, and the second one is the Spanish version).
Figure 15. Duplicate citations in Google Scholar
i) Missing citations
There are cases when Google Scholar's parser fails to match a cited reference
inside a document with the record of the document it is citing. When Google
Scholar parses the reference section of an article, it tries to find a match for
each reference among its records, but if for some reason the reference hasn't
been correctly recorded (the authors of the citing article may have made a
mistake when citing it, or used an uncommon reference format Google Scholar
doesn't understand), the system will be unable to make the connection between
the two documents.
However, we also find examples in which no apparent mistake has been made
in the citing document, but the citation still isn't attributed to the cited document.
In order to illustrate this issue, in Figure 16 we show how a document ("How to
cook the university rankings") cites another document (a doctoral thesis) in its
reference section. However, this citation doesn't appear as one of the 13
citations that the thesis has received according to Google Scholar. The reason
is unknown. At the time the citing document was first indexed, the connection
wasn't made for some reason, and this error hasn't been fixed since. Typos in
the PDF can also generate this kind of error.
Figure 16. Citations unrevealed in Google Scholar
All the errors described above relate directly to the Google Scholar database
(and to how the automatic parser works). Next, we describe some of the
mistakes identified in the creation of bibliographic profiles in Google Scholar
Citations:
a) Duplicate profiles
Since the only requirement for creating a public academic profile in Google
Scholar Citations is a valid email address, an author (or anyone, really) may
create as many profiles as he/she wants. This opens the door to duplicate
profiles, that is, different profiles about the same person. In Figure 17 we
present some examples of duplicate profiles of authors related to the field of
Bibliometrics. The differences in citation counts between profiles are sometimes
quite high (for example, one of the profiles belonging to Ruiz-Castillo reaches
1,843 citations, whereas in the second profile the figure goes up to 2,430).
Figure 17. Duplicate profiles in Google Scholar Citations
A real problem can arise when one of the profiles has been created by
someone other than the author the profile is about. The author may send a
request to Google Scholar to delete the profile, but this kind of request might
take a while to be processed, generating a feeling of helplessness in the author.
b) Variety of document types (including non-academic documents)
One of the main criticisms of Google Scholar Citations profiles (when
considering whether they are suited for evaluation purposes) is the inclusion of
a wide variety of document types: from peer-reviewed articles to posters. An
author can add any kind of work to his/her profile, and sometimes these aren't
even academic works: teaching materials, software, online resources, etc.
(Figure 18).
While this is a true shortcoming from the research evaluation perspective, these
profiles are designed to showcase any material that the author considers
appropriate, especially if these materials could potentially generate some kind
of impact through citations. The possibility of selecting the document type (as
ResearchGate does) may help solve this problem; in Google Scholar Citations,
however, the document type is only an internal mechanism, not reflected in the
public profile.
Figure 18. Teaching materials in Google Scholar Citations
c) Inclusion of misattributed documents in the profile
The Google Scholar team doesn't oversee the validity of the information
available in Google Scholar Citations. Therefore, it is the sole responsibility of
the author to ensure that the information visible in his/her profile is accurate.
Profiles can be set to be updated automatically (when the system finds an
article that it is reasonably sure was written by the profile owner, it is
automatically added to the profile), or to ask the author for confirmation
whenever the system thinks an addition or a change should be made. If the
user selects automatic updates, there is a risk that the system will add
documents to the profile that the author hasn't actually written, thus falsely
increasing the author's bibliometric indicators. The author will probably be
completely oblivious to this issue if he or she doesn't check the profile regularly.
In that case, it shouldn't be considered an active attempt to fake one's
bibliometric indicators, but it is still a matter that should be fixed as soon as it
comes to the author's knowledge. In Figure 19 we can see an example: the
third document (marked in red), which has received 40 citations, wasn't written
by the owner of the profile (Imma Subirats-Coll).
Figure 19. Misattributed documents in Google Scholar Citations
We can also find examples where the owner of the profile has participated as a
translator or editor of a work (Figure 20). The assignment of the citation counts
of a work to the people who have fulfilled these kinds of roles is controversial.
At the very least, they should make sure that their role is clearly stated and
visible in the profile.
Figure 20. Edition and translation roles in Google Scholar Citations
d) Deliberate manipulation of documents and citations in Google Scholar
Another issue is the conscious manipulation of profiles by their owners.
The fact that anyone, without advanced technical skills, can manipulate his/her
own bibliometric indicators, or other people's (Delgado López-Cózar,
Robinson-García & Torres-Salinas, 2014), may affect the credibility of GSC
academic profiles if no action to control this issue is taken by the Google
Scholar team. In Figure 21 we observe how uploading a set of fake documents
to a repository (with nonsensical text, and a list of references which includes the
set of documents whose impact you want to boost) will, in just a few days,
cause the desired adulteration of citation scores in the profiles of the authors of
the referenced documents.
Figure 21. Effect of data manipulation in Google Scholar Citations
Source: Delgado López-Cózar, Robinson-García & Torres-Salinas, 2014
e) Duplicate documents in profiles
This is also a side effect of the cases in which Google Scholar fails to group
together different versions of the same document. The consequence for the
profiles is that the different versions will also be added as different records in
the profile, which might affect (positively or negatively) indicators like the
h-index and the i10-index, which are computed automatically. Fortunately,
profile users can manually merge records in their profile, which solves this
issue (Figure 22). This merge only affects the author's profile. It doesn't alter
Google Scholar search results in any way; that is, there will still be two (or
more) records for that document in Google Scholar's index, at least until the
error gets fixed in a future update.
Figure 22. Versions not tied in Google Scholar Citations
f) Incorrectly merged documents
The downside of the fact that an author can freely merge documents in his/her
profile is, obviously, that incorrect merges (of different documents) can also be
made. As we discussed before, Google Scholar doesn't run any validity or
accuracy checks on the information displayed in these profiles. Of course, this
can also have a distorting effect on the automatically generated author-level
indicators.
Figure 23. Incorrectly merged records in Google Scholar Citations
g) Unclean document titles
This error is inherited from Google Scholar's metadata parsing errors.
Google Scholar Citations allows authors to modify almost all aspects of a record
in their profile, including the title of a document. Unfortunately, not all authors
pay attention to such details, and so these errors persist (Figure 24).
Figure 24. Parse errors in identifying document titles in Google Scholar Citations
h) Missing or uncommon areas of interest
One last limitation that may affect the results of this Working Paper is related to
the areas of interest declared by the authors in their profiles (a maximum of five
areas can be provided). Researchers in Bibliometrics who have a public profile
in Google Scholar Citations but haven't declared any area of interest (Figure 24,
top), who use uncommon keywords, or who use keywords in a language other
than English (Figure 24, bottom) may have been overlooked.
Figure 24. Missing (top) and uncommon (bottom) areas of interest in Google Scholar Citations
3.4.2. ResearcherID
One of the main shortcomings of ResearcherID is the need to update profiles
manually. An author needs to synchronize his/her account with a search in the
Web of Science Core Collection in order to update the list of publications,
unlike in Google Scholar and ResearchGate, where the process is largely
carried out by the system, and authors only need to confirm new additions or
modifications when the system prompts them to do so.
The fact that active manual intervention is needed on the author's part to keep
the profile up to date results in a very inconsistent set of data. Authors
concerned with online visibility will regularly update their profiles, but in the
majority of cases, authors will rarely visit their profile again after setting it up for
the first time. This may explain the results previously shown in Figure 5.
Moreover, we have found additional shortcomings in the system, caused by
incorrectly attributed citations in Web of Science, which affect ResearcherID
profiles.
Let's illustrate this issue with an example in which Dr. Eugene Garfield will be
our test subject. In Figure 25 we can see the citation metrics for Eugene
Garfield's academic profile according to ResearcherID, which displays the
number of articles published, the sum of times cited, the h-index, and other
bibliometric indicators based on data from the Web of Science Core Collection.
Since Dr. Garfield hasn't created a Google Scholar Citations profile for himself,
we generated a private profile in GSC (only accessible by us) in order to
compare the indicators provided by the two profile platforms. A screenshot of
this profile can be seen in Figure 26.
Figure 25. Eugene Garfield’s academic profile in ResearcherID
Figure 26. Eugene Garfield’s academic profile in Google Scholar Citations
As we can see, there is a huge difference between Dr. Eugene Garfield's h-
index according to ResearcherID (154) and his h-index according to Google
Scholar (55). This is caused by a technical error in the data provided by Web of
Science. Dr. Garfield's ResearcherID profile contains a great number of works
published in Current Contents, many of them with exactly 200 citations (Figure
27), an odd phenomenon. There is another large group of documents with
exactly 155 citations, and other groups of documents which also share the
same number of citations.
Figure 27. Eugene Garfield’s publication view in ResearcherID
The examination of any of these documents in the Web of Science database
reveals that all these citations have been incorrectly attributed. In fact, there are
some cases where, according to the Web of Science, a document cites itself
(Figure 28). The cause of this error is still unknown to us, and further research
is needed to ascertain how often this kind of error occurs throughout the Web of
Science database.
Figure 28. Eugene Garfield’s citing articles in Web of Science
3.4.3. Mendeley
An unusual phenomenon was detected while perusing some bibliometricians'
profiles in Mendeley: many papers published in the Journal of the American
Society for Information Science and Technology (JASIST) had abnormally high
reader counts (the number of Mendeley users who have saved a certain paper
to their collection of references). On November 6th, 2015, a group of JASIST
articles all exhibited exactly 5,074 readers. Figure 29, a snapshot taken from
Mike Thelwall's Mendeley profile, illustrates this phenomenon.
Figure 29. Mike Thelwall's publications with incorrect reader counts in Mendeley
The immediate cause of this issue seems to be that all of these articles had
been incorrectly linked to the same paper (Figure 30), which had precisely
5,074 readers. This paper, which doesn't have anything to do with the JASIST
articles shown previously, could be accessed by clicking on any of the titles of
the JASIST papers from their authors' profiles. The technical reason why this
could have happened is still unknown.
Figure 30. Publication causing misleading readership metrics in Mendeley
The fact is that this phenomenon has affected several researchers in our study,
greatly distorting their aggregate reader counts. The most noticeable case is
that of Dr. Mike Thelwall, who has 23 articles affected by this issue in his
personal profile, raising his aggregate reader count to 118,046 readers on
November 6th (Figure 31), much higher than the count we collected in
September (7,423). The error hasn't been fixed yet, and this count keeps
growing every day (144,319 by January 14th, 2016).
Figure 31. Mike Thelwall’s personal profile metrics in Mendeley (6th November 2015)
Lastly, it is important to note that if any of these documents is searched directly
using Mendeley's search feature, the results show the correct (or at least more
plausible) reader counts for the articles (Figure 32).
Figure 32. Direct search of documents in Mendeley
Apart from these anomalous readership metrics in Mendeley (which should be
understood as an anecdotal mistake that Mendeley will presumably fix soon),
we have found other malfunctions caused by errors in the metadata of the
references added to the platform, which also affect readership metrics.
In Figure 32 we can see how one author (Arvid Kappas) is missing from one of
the two versions of the article "Sentiment strength detection in short informal
text". Probably for this reason, Mendeley didn't consider them to be the same
document, and thus, at some point, it created a second record for the document
instead of merging it with the version it already had. This, in turn, meant that the
reader counts would be split between the two versions of the document (a
scattering effect similar to the one found in Google Scholar Citations with
versions and citations, as previously described).
Incorrect metadata are not the only cause of erroneous reader counts; missing
metadata can also be dangerous. In Figure 33, taken from Zhigang Hu's
Mendeley profile on November 6th, 2015, there are examples of both
incorrect metadata (the first article) and missing metadata (the second article)
leading to inaccurate reader counts.
Figure 33. Documents with incorrect or missing metadata affecting Mendeley reader counts
In the first case, the title of one of this researcher's articles wasn't correctly
parsed from the PDF, and an incorrect string was selected as the title instead.
This is a relatively common issue, so all the articles which have been
incorrectly parsed in a similar way and share the same incorrect title
("Metadata of the article that will be visualized in OnlineFirst") have been
lumped together by Mendeley, which explains the high reader count for that
article. The same explanation could probably be applied to the second
document: all documents with a missing title or with the incorrect title "No Title"
must have been merged by Mendeley to obtain such a high reader count
(55,893).
3.4.4. ResearchGate
ResearchGate (RG), the academic profiling and sharing platform created by Dr.
Ijad Madisch (https://www.researchgate.net/profile/Ijad_Madisch) and Dr. Sören
Hofmayer (https://www.researchgate.net/profile/Soeren_Hofmayer) in 2008, is
currently gaining momentum as one of the most used services of this kind
among researchers. In May 2015 they announced they had reached 7 million
users (https://www.researchgate.net/blog/post/celebrating-seven-million-members-and-seven-years-of-researchgate),
and just five months later, in October, they claimed to have reached 8 million
(https://www.researchgate.net/blog/post/8-out-of-8-million).
The reasons behind the success of this platform are undoubtedly related to the
constant stream of new (and usually very convenient) features the platform has
been introducing in recent months, but probably also to the constant flow
of ego-boosting e-mails that users receive, informing them about the great
impact their work is having on the scientific community.
Like the rest of the platforms fulfilling similar needs, RG computes a set of
indicators designed to measure the popularity, impact, and degree of use of
the documents a researcher uploads to the system (Thelwall and Kousha,
2015). In section 3.3 we observed how these metrics (especially the RG Score)
achieved a high correlation with the impact metrics provided by Google Scholar
Citations (especially total citations and the h-index). Moreover, at the moment
we collected the data, this platform was the only one that provided both citation
and usage metrics for articles (until the Web of Science began to offer usage
metrics in November 2015). All these impressive results are partly a
consequence of this momentum in terms of user growth.
However, we must point out some important shortcomings related to the lack of
transparency in the way all these metrics are computed, a lack of transparency
that makes them currently unsuitable for scientific evaluation. It looks like
ResearchGate is acting like a modern "alchemist", in the sense that it produces
its own "concoctions" without revealing their ingredients or method of
preparation to anyone, an issue that, of course, has not gone unnoticed by the
scientific community (see http://blogs.lse.ac.uk/impactofsocialsciences/2015/12/09/the-researchgate-score-a-good-example-of-a-bad-metric).
First, we may consider the RG Score, the indicator displayed most prominently
in researchers' profiles, situated right next to the name of the researcher.
According to ResearchGate (https://www.researchgate.net/publicprofile.RGScoreFAQ.html),
this author-level indicator measures scientific reputation "based on how all of
your research is received by your peers". The main concern with this indicator,
in terms of usefulness for scientific evaluation, is that the way it is calculated
hasn't been made public. Therefore, even though this indicator may be a good
way to attract researchers who enjoy going on ego trips once in a while, the
fact that only ResearchGate knows how to calculate it renders it ill-suited for
research assessment, before the discussion about its intrinsic merits and
defects can even begin.
Another matter is that, at the end of September 2015, that is, a few weeks after
we collected our data about bibliometric researchers (results offered in sections
3.1 and 3.2), ResearchGate combined two of the indicators it used to display on
its users' profiles (document views and downloads) into one (Reads)
(https://www.researchgate.net/blog/post/introducing-reads). According to them,
"a read is counted each time someone reads the summary or full-text, or
downloads one of your publications from ResearchGate". However, the
document view and download counts collected in September don't match the
read counts available after that change (Table 10): "Reads" are clearly lower
than the combination of downloads and views. The separation of summary
views and document views may have something to do with this issue, a matter
that should be further analyzed.
Table 10. Top 10 authors with the highest Reads counts on ResearchGate (November 9th, 2015),
compared to their Downloads and Views counts on September 10th, 2015. The last column shows
Reads as a percentage of the sum of Downloads and Views.

    AUTHOR NAME          DOWNLOADS (Sep 10)   VIEWS (Sep 10)   READS (Nov 9)   READS/(DL+VIEWS) (%)
    Loet Leydesdorff              32,165           42,926          21,013                27.98
    Mike Thelwall                 24,989           34,376          17,748                29.90
    Chaomei Chen                  31,579           26,734          13,452                23.07
    Nader Ale Ebrahim             31,853           23,144          10,282                18.70
    Lutz Bornmann                 13,556           22,987           9,863                26.99
    Maite Barrios                 14,234            7,600           9,439                43.23
    Wolfgang Glänzel              10,572           20,145           9,439                30.73
    Félix Moya Anegón             18,691           23,583           8,625                20.40
    Cassidy Sugimoto              13,079            8,081           8,458                39.97
    Ronald Rousseau                8,066           19,118           6,934                25.51
The same can be said about the "profile views" indicator: the counts
obtained back in September are always higher than the ones available two
months later, on November 9th (Table 11). To the best of our knowledge, there
has been no announcement regarding any changes in the profile views
indicator.
Table 11. Top 10 authors with the highest profile view counts on ResearchGate (November 9th,
2015), compared to the same indicator on September 10th, 2015. The last column shows the
November count as a percentage of the September count.

    AUTHOR NAME             PROFILE VIEWS (Sep 10)   PROFILE VIEWS (Nov 9)   NOV/SEP (%)
    Nader Ale Ebrahim                  19,821                13,281             67.00
    Chaomei Chen                        7,760                 3,937             50.73
    Loet Leydesdorff                    4,227                 1,758             41.59
    Bakthavachalam Elango               2,883                 1,756             60.91
    Zaida Chinchilla                    5,840                 1,569             26.87
    Mike Thelwall                       4,297                 1,568             36.49
    Lutz Bornmann                       3,129                 1,439             45.99
    Wolfgang Glänzel                    3,012                 1,301             43.19
    Kevin Boyack                        3,256                 1,135             34.86
    Peter Ingwersen                     2,335                 1,025             43.90
In any case, a high Pearson correlation is observed between the sum of
Downloads and Views and the new Reads indicator (r = 0.93; n = 499; 95%
confidence level; p-value < 2.2e-16), and also between the Profile View counts
collected in September and the ones collected in November (r = 0.93; n = 535;
95% confidence level; p-value < 2.2e-16).
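A correlation of this kind can be reproduced in a few lines. The sketch below
uses the first three rows of Table 10 as sample input (the variable names are
ours):

    import numpy as np
    from scipy.stats import pearsonr

    # September Downloads/Views and November Reads for three authors (Table 10).
    downloads = np.array([32165, 24989, 31579], dtype=float)
    views     = np.array([42926, 34376, 26734], dtype=float)
    reads     = np.array([21013, 17748, 13452], dtype=float)

    # Pearson r between (Downloads + Views) and Reads.
    r, p = pearsonr(downloads + views, reads)
    print(round(r, 2), p)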
3.4.5. General strengths and shortcomings of academic profiles
Lastly, Table 12 summarizes the main strengths and weaknesses of each of the
platforms analyzed in this study.
Table 12. Advantages and disadvantages of academic profiles provided by social platforms

GOOGLE SCHOLAR CITATIONS
Advantages: widest coverage (all languages, sources, and disciplines); user-friendly; high growth rate; automatic updates; alerts (new citations to your work, or publications from other authors).
Disadvantages: scarce quality control; open to manipulation; inherits mistakes from Google Scholar.

RESEARCHERID
Advantages: offers advanced bibliometric indicators.
Disadvantages: no automatic updates; not very user-friendly; inherits mistakes from WoS; not used by many authors; only WoS Core Collection publications count towards citation metrics.

RESEARCHGATE
Advantages: increasingly used by the scientific community (very high growth rate); offers usage data (views and downloads); user-friendly; correlates with citation data; social functions to contact other authors.
Disadvantages: no automatic updates (a co-author must upload the document); lack of transparency in its indicators; still not used by many authors; sends too many e-mails (by default).

MENDELEY PROFILES
Advantages: increasingly used by the community; offers usage data (reads); correlates with citation data; allows discipline analysis; social functions (follow other authors).
Disadvantages: no automatic updates; quality of metadata depends on user input.
It is clear that none of the platforms considered and analyzed in this Working
Paper is without its problems and limitations. At the same time, all of them offer
new insights for measuring scientific impact.
Google Scholar offers the widest coverage, estimated at approximately 160
million records as of May 2014 (Orduna-Malea et al, 2014). Its indexing criteria
(all academic documents openly stored in the academic web space) make this
database the only place where every academic document is indexed regardless
of its typology (not only journal articles but also books, book chapters, reports,
doctoral theses, conference proceedings, etc.), its language, or its discipline.
Thanks to this wide variety of sources, Google Scholar is able to measure not
only scientific but also educational and professional impact in the broadest
sense of the term. At the same time, as regards strictly scientific impact, there
is a high correlation (r = 0.8) between the number of citations these documents
receive in GS and their citations in WoS (Martin-Martin et al, 2014).
Google Scholar Citations includes citation scores for authors, areas of interest,
and institutional information. Additionally, in this platform the owner of the
profile can improve the bibliographic information provided by Google Scholar,
and merge duplicates Google Scholar hasn't been able to detect. This
impressive collection of data, together with the development of functionalities
(such as detecting and merging duplicates), makes Google Scholar the best
tool for the bibliometric analysis of some disciplines, especially those within the
areas of the Humanities, Social Sciences, and Engineering.
Unfortunately, Google Scholar is not without its problems. The possibility of
editing records in the profiles does not solve its parsing problems, for which
sometimes there seems to be no clear explanation. We must point out, however,
that the system is improving year by year. Moreover, in an academic big data
environment, these errors (which we estimate affect less than 10% of the
records in the database) are of no great consequence, and do not affect the
core system performance significantly.
On the other hand, the philosophy of the product (oriented to the user, lacking
any bibliographic control) makes the tool rather open to confusing data,
mistakes (described in section 3.4.1), and manipulation, a really serious
problem in academia at the moment. Scientific misconduct should not be
disregarded as mere spam.
Moreover, Google Scholar is user-friendly but not bibliometrician-friendly.
Google Scholar's agreements with big publishers to collect data from their
servers and present them in the search engine come at a price: among other
things, the impossibility of offering an API, which would no doubt be highly
welcomed by the scientific community. An API would allow us to keep working
on our understanding of the production, dissemination, and consumption of
scientific information worldwide.
ResearchGate is the second most-used platform among the tools analyzed in
this work. The high number of users that this platform is currently attracting
reinforces the validity of the metrics it provides (essentially because of the great
number of documents that have already been uploaded to the system). This is
reflected in the extraordinary correlation that the RG Score achieves with the
h-index and total citations from Google Scholar. Moreover, there is no better
platform to calculate the number of downloads per document.
We believe this is a logical result, because the RG Score seems to be basically
made up of the number of publications an author has, the citations to these
publications, and the JCR Impact Factor of the journals where these articles
were published. Usage indicators may also have some weight, but not much
yet. Nonetheless, the lack of transparency in the calculation of the different
metrics (especially the RG Score) prevents them from being useful, since they
cannot be replicated.
This is the reason why the following questions still arise: what was
ResearchGate really measuring before the changes in the View and Download
indicators took place? What is it really measuring now? Why isn't ResearchGate
more open about the way it computes the indicators it displays?
Moreover, the introduction of subjective values (such as participation in the
questions & answers section of the platform) may introduce some bias (high
participation in the social platform does not have anything to do with academic
impact, though it serves to incentivize the use of the platform). In any case, the
weight of this parameter doesn't seem to be significant.
Likewise, changes in company policy, such as the elimination of some
services (the complete list of documents ranked by number of reads is
no longer available), make this platform unpredictable and unreliable at the
moment. Other specific limitations relate to the quantity of documents
indexed in the platform, references that are not properly identified, and
incorrectly attributed citations.
Regarding Mendeley, we should acknowledge the validity of the Readers
indicator, which strongly correlates both with the Downloads indicator provided
by ResearchGate (different sides of usage) and with the citation-based metrics
from Google Scholar. However, we found some limitations in this platform while
studying the Bibliometrics community (which may well extend to other
academic communities).
First, calling the number of users that have saved a bibliographic record in their
personal collection "readers" is absolutely incorrect (Delgado López-Cózar and
Martín-Martín, 2015). The term should be changed to one that more accurately
represents the nature of the indicator, because the current one can lead to
misunderstandings and misinterpretations. The new "Reads" metric provided by
ResearchGate suffers from the same problem, as it combines online accesses
to the document with downloads, which are not the same thing even though the
platform claims they are.
Second, the fact that there are no automatic profile updates makes the system
completely dependent on user activity. A total of 149 out of the 336 profiles
analyzed (44.3%) did not include a single document (Figure 34), and only 23% of
the researchers have an effective presence in the platform. This strongly
limits the usefulness of Mendeley for the purpose of evaluating authors.
Figure 34. Example of empty academic profile in Mendeley
The last academic profiling service we analyzed was ResearcherID. There are no
automatic profile updates in this platform, and a great percentage of user
profiles (34.4%) display no public publications (Figure 35); that is, the
profile only contains basic information about the author's subject interests
and affiliation. Only 26% of the authors in our sample had a ResearcherID
profile with at least one document, and most of these profiles were out of date.
Figure 35. Example of an empty academic profile in ResearcherID
Apart from this lack of real use, we found several errors that had been
inherited from the citation scores available in the Web of Science; that is, WoS
is not error-free in the attribution of citation scores. For all these reasons, we do
not consider ResearcherID a valuable tool for bibliometric purposes.
4. CONCLUSIONS
Although this work is focused on the analysis of a specific academic community
(Bibliometrics), the results allowed us to reach a number of important findings,
summarized below.
Firstly, Google Scholar (with its associated platform for academic profiles,
Google Scholar Citations) provides a very precise and accurate picture of the
bibliometric community. The data collected, not only at the author level but also
at the document and source levels (journals and books), clearly corresponds to
our mental image of the field. That is, Google Scholar helped identify the most
influential authors (core and related) and sources (journals and publishers) in
the discipline.
The level of use of other social platforms is quite far from that of Google
Scholar Citations, not only in the number of user profiles created, but
also in the regularity with which they are updated. ResearchGate's growth rate
is impressive, and it currently stands as the second most-used profile platform
in the bibliometric community. Its usage indicators (Downloads and Views) and
its social network features (communication and information sharing among users)
provide a perspective that Google Scholar Citations lacks.
The social tools analyzed here have a number of significant limitations, which
clearly get in the way of generating academic mirrors complementary to those
based merely on citations. In the case of ResearchGate, these limitations are
caused by the opacity of the indicators and unexpected changes in company
policy, whereas in Mendeley and ResearcherID the problems arise from the
existence of outdated profiles. This issue has a negative effect on the
accuracy of the information provided by these platforms, as seen in Figure 5.
Twitter, on the other hand, presents a completely different picture. Its author-
level indicators correlate neither with citation-based indicators (from Google
Scholar) nor with usage indicators (provided by ResearchGate and Mendeley),
but they do correlate with other network indicators (which measure an author's
participation in the community as well as his/her ability to connect with other
users). This lack of correlation should not, however, be understood negatively.
Instead, we should interpret it as a sign that these indicators measure a
different dimension of the author's impact on the Web.
Two different kinds of indicators were found in these platforms: first, all metrics
related to academic performance; this group can further be divided into
usage metrics (views and downloads) and citation metrics. Second, all metrics
related to connectivity and popularity (followers). ResearchGate provides
examples of these two sides of academic performance (usage and citations),
since Google Scholar Citations profiles do not offer data about downloads or reads.
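A sketch of the kind of dimensionality reduction that separates these groups is given below, assuming scikit-learn and NumPy are available; the random matrix is a placeholder for the real author-by-indicator table, which should be standardized first so that no single platform's scale dominates the components.

```python
# Sketch of principal component analysis over an author-by-indicator
# matrix; the random data is a placeholder for the real table.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
indicators = rng.random((800, 30))     # placeholder: authors x metrics

pca = PCA(n_components=2)
scores = pca.fit_transform(StandardScaler().fit_transform(indicators))
print(pca.explained_variance_ratio_)   # variance captured by each component

# Inspecting pca.components_ shows which metrics load together; in our
# data, academic-impact metrics (citations, usage) separate from
# connectivity/popularity metrics (followers).
```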
In the process of conducting this analysis, we identified a series of errors that
allowed us to outline the main limitations of each product. May this serve as a
sign that this study was not made with the intention of exalting a particular
database over the others. On the contrary, the intention was to thoroughly,
comprehensively, conscientiously, and neutrally test the possibilities of Google
Scholar as a tool for scientific evaluation.
In this sense, the empirical results indicate that Google Scholar should be the
preferred source for relational and comparative analyses in which the emphasis
is put on author clusters. Individual data should be taken with some caution, as
they may be subject to errors. Despite these errors (as well as the lack of more
advanced filtering features), Google Scholar has been able to measure the
academic community dedicated to measuring, and has done so successfully:
detecting "those who count" (bibliometricians).
Lastly, the results should be understood within the context of the bibliometric
community. They may differ in other academic communities, where the greater
or lesser use of these technologies can clearly influence the data. Furthermore,
there is a certain positive bias in the use of these platforms, because within the
bibliometric community they are part of the object of study of the discipline, as
is the case in this work.
REFERENCES
Barjak, F., Li, X., & Thelwall, M. (2007). Which factors explain the web impact of
scientists' personal homepages? Journal of the American Society for Information
Science and Technology, 58(2), 200-211.
Becher, T., & Trowler, P. (2001). Academic tribes and territories: Intellectual enquiry
and the culture of disciplines. McGraw-Hill Education (UK).
Bensman, S. J. (2007). Garfield and the impact factor. Annual Review of Information
Science and Technology, 41(1), 93-155.
Bonitz, M. (1982). Scientometrie, Bibliometrie, Informetrie. Zentralblatt für
Bibliothekswesen, 96(2), 19-24.
Borgman, C. L. & Furner, J. (2002). Scholarly Communication and Bibliometrics.
Annual Review of Information Science and Technology, 36, 3-72.
Braun, T. (1994). Little scientometrics, big scientometrics and beyond?
Scientometrics, 30, 373-537.
Broadus, R. N. (1987a). Early approaches to bibliometrics. Journal of the American
Society for Information Science, 38, 127-129.
Broadus, R. N. (1987b). Toward a definition of 'bibliometrics'. Scientometrics, 12,
373-379.
Brookes, B. C. (1988). Comments on the scope of bibliometrics. In: L. Egghe & R.
Rousseau (Eds.), Informetrics 87/88: Select Proceedings of the First International
Conference on Bibliometrics and Theoretical Aspects of Information Retrieval.
Amsterdam: Elsevier Science, 29-41.
Brookes, B. C. (1990). Biblio-, Sciento-, Infor-metrics??? What are we talking about?
In: L. Egghe & R. Rousseau (Eds.), Informetrics 89/90: Selection of Papers Submitted
for the Second International Conference on Bibliometrics, Scientometrics and
Informetrics. Amsterdam: Elsevier, 31-43.
Castells, M. (2002). La galaxia internet. Barcelona: Plaza & Janés.
Cronin, B. (2001). Bibliometrics and beyond: some thoughts on web-based citation
analysis. Journal of Information Science, 27(1), 1-7.
De Bellis, N. (2009). Bibliometrics and citation analysis: from the science citation index
to cybermetrics. Maryland: Scarecrow Press.
Delgado López-Cózar, E. & Martín-Martín, A. (2015). Thomson Reuters coquetea con
las altmetrics: usage counts para los artículos indizados en la Web of Science. EC3
Working Papers, 20.
Delgado López-Cózar, E., Robinson-García, N. & Torres-Salinas, D. (2014). The
Google Scholar Experiment: how to index false papers and manipulate bibliometric
indicators. Journal of the Association for Information Science and
Technology, 65(3), 446-454.
Franceschini, F., Maisano, D. & Mastrogiacomo, L. (2015). Research quality
evaluation: comparing citation counts considering bibliometric database errors.
Quality & Quantity, 49(1), 155-165.
García-Pérez, M. A. (2010). Accuracy and completeness of publication and citation
records in the Web of Science, PsycINFO, and Google Scholar: A case study for the
computation of h indices in Psychology. Journal of the American Society for
Information Science and Technology, 61(10), 2070-2085.
Garfield, E. (1983). Idiosyncrasies and errors, or the terrible things journals do to us.
Current Contents, 2, 5-11.
Garfield, E. (1990). Journal editors awaken to the impact of citation errors: how we
control them at ISI. Current Contents, 41, 5-13.
Glänzel, W. & Schoepflin, U. (1994). Little scientometrics, big scientometrics and
beyond? Scientometrics, 30, 375-384.
Godin, B. (2006). On the origins of bibliometrics. Scientometrics, 68(1), 109-133.
González-Díaz, C., Iglesias-García, M. & Codina, L. (2015). Presencia de las
universidades españolas en las redes sociales digitales científicas: caso de los
estudios de comunicación. El profesional de la información, 24(5), 640-647.
Gorbea Portal, S. (1994). Principios teóricos y metodológicos de los estudios métricos
de la información. Investigación Bibliotecológica, 8, 23-32.
Haustein, S., Peters, I., Bar-Ilan, J., Priem, J., Shema, H. & Terliesner, J. (2014).
Coverage and adoption of altmetrics sources in the bibliometric
community. Scientometrics, 101(2), 1145-1163.
Hertzel, D. H. (1987). History of the development of ideas in bibliometrics. In: A. Kent
(Ed.), Encyclopedia of library and information sciences, Vol. 42 (Supplement 7).
New York: Marcel Dekker, 144-219.
Hood, W. & Wilson, C. (2001). The literature of bibliometrics, scientometrics, and
informetrics. Scientometrics, 52(2), 291-314.
Jacsó, P. (2005). Google Scholar: the pros and the cons. Online Information
Review, 29(2), 208-214.
Jacsó, P. (2006a). Deflated, inflated and phantom citation counts. Online Information
Review, 30(3), 297-309.
Jacsó, P. (2006b). Dubious hit counts and cuckoo's eggs. Online Information
Review, 30(2), 188-193.
Jacsó, P. (2008). Google Scholar revisited. Online Information Review, 32(1), 102-114.
Jacsó, P. (2010). Metadata mega mess in Google Scholar. Online Information
Review, 34(1), 175-191.
Jamali, H. R., Nicholas, D. & Herman, E. (2015). Scholarly reputation in the digital age
and the role of emerging platforms and mechanisms. Research Evaluation, rvv032.
Kramer, B. & Bosman, J. (2015). 101 Innovations in Scholarly Communication:
the Changing Research Workflow. Available at:
https://101innovations.wordpress.com/
Larivière, V. (2012). The decade of metrics? Examining the evolution of metrics within
and outside LIS. Bulletin of the American Society for Information Science and
Technology, 38(6), 12-17.
Larivière, V., Sugimoto, C. & Cronin, B. (2012). A bibliometric chronicling of library and
information science's first hundred years. Journal of the American Society for
Information Science and Technology, 63(5), 997-1016.
Lawani, S. M. (1981). Bibliometrics: its theoretical foundations, methods and
applications. Libri, 31, 29-43.
Martín-Martín, A., Orduña-Malea, E., Ayllón, J. M. & Delgado López-Cózar, E. (2014).
Does Google Scholar contain all highly cited documents (1950-2013)? EC3
Working Papers, 19.
Más-Bleda, A. & Aguillo, I. F. (2013). Can a personal website be useful as an
information source to assess individual scientists? The case of European highly
cited researchers. Scientometrics, 96(1), 51-67.
Mas-Bleda, A., Thelwall, M., Kousha, K. & Aguillo, I. F. (2014). Do highly cited
researchers successfully use the social web?. Scientometrics, 101(1), 337-356.
McCain, K. W. (2010). The view from Garfield's shoulders: Tri-citation mapping of
Eugene Garfield's citation image over three successive decades. Annals of Library
and Information Studies, 57, 261-270.
Mikki, S., Zygmuntowska, M., Gjesdal, Ø. L. & Al Ruwehy, H. A. (2015). Digital
Presence of Norwegian Scholars on Academic Network Sites: Where and Who Are
They? PLoS ONE, 10(11), e0142709.
Moed, H. F. & Vriens, M. (1989). Possible inaccuracies occurring in citation analysis.
Journal of Information Science, 15(2), 95-107.
Narin, F. & Moll, J.K. (1977). "Bibliometrics". Annual Review of Information Science
and Technology, 12, 35-58.
Nicolaisen, J. & Frandsen, T. F. (2015). Bibliometric evolution: Is the Journal of the
Association for Information Science and Technology transforming into a specialty
journal? Journal of the Association for Information Science and Technology, 66(5),
1082-1085.
Orduna-Malea, E., Ayllón, J. M., Martín-Martín, A. & Delgado López-Cózar, E. (2015).
Methods for estimating the size of Google Scholar. Scientometrics, 104(3),
931-949.
Peritz, B. C. (1984). On the careers of terminologies: the case of bibliometrics. Libri,
34, 233-242.
Poyer, R. K. (1979). Inaccurate references in significant journals of science. Bulletin
of the Medical Library Association, 67(4), 396.
Sengupta, I. N. (1992). Bibliometrics, informetrics, scientometrics and librametrics: an
overview. Libri, 42, 75-98.
Shapiro, F. R. (1992). Origins of Bibliometrics, Citation Indexing, and Citation
Analysis: The Neglected Legal Literature. Journal of the American Society for
Information Science, 43(5), 337-339.
Sher, I. H., Garfield, E., & Elias, A. W. (1966). Control and Elimination of Errors in ISI
Services. Journal of Chemical Documentation, 6(3), 132-135.
Thelwall, M. (2008). Bibliometrics to webometrics. Journal of Information Science,
34(4), 605-621.
Thelwall, M. & Kousha, K. (2015). ResearchGate: Disseminating, communicating, and
measuring Scholarship? Journal of the Association for Information Science and
Technology, 66(5), 876-889.
Van Raan, A. (1997). Scientometrics: State-of-the-art. Scientometrics, 38(1), 205-
218.
Van Noorden, R. (2014). Online collaboration: Scientists and the social network.
Nature News, 512(7513), 126-129.
White, H. D. & McCain, K. W. (1998). Visualizing a Discipline: An Author Co-Citation
Analysis of Information Science, 1972-1995. Journal of the American Society for
Information Science, 49(4), 327-355.
White, H.D. & McCain, K.W. (1989). Bibliometrics. Annual review of information
science and technology, 24, 119-186.
Whitley, R. (1984). The intellectual and social organization of the sciences. UK: Oxford
University Press.
Wilson, C.S. (1999). Informetrics. Annual Review of Information Science and
Technology, 34, 107-247.