ArticlePDF Available

Science Mapping: A Systematic Review of the Literature

Authors:
  • College of Computing and Informatics, Drexel University

Abstract and Figures

Purpose: We present a systematic review of the literature concerning major aspects of science mapping to serve two primary purposes: First, to demonstrate the use of a science mapping approach to perform the review so that researchers may apply the procedure to the review of a scientific domain of their own interest, and second, to identify major areas of research activities concerning science mapping, intellectual milestones in the development of key specialties, evolutionary stages of major specialties involved, and the dynamics of transitions from one specialty to another. http://www.jdis.org/10.1515/jdis-2017-0006
Content may be subject to copyright.
1
Expert Review
JDIS
Journal of Data and
Information Science
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
Science Mapping:
A Systematic Review of the Literature
Chaomei Chen
College of Computing and Informatics, Drexel University,
Philadelphia, PA 19104-2875, USA
Dr. Chaomei Chen is a Professor of Informatics in the
College of Computing and Informatics at Drexel University,
USA. He received a B.Sc. in Mathematics (Nankai
University, China), an M.Sc. in Computation (University of
Oxford, UK) and a Ph.D. in Computer Science (University
of Liverpool, UK). He served as a Visiting Professor at
Brunel University, UK and a Chang Jiang Scholar at Dalian
University of Technology, China. He served as a member
of Clarivate Analytics’s (formerly Thomson Reuter) Strategic
Advisory Board, the Research Portfolio Analysis
Subcommittee of the CISE/SBE Advisory Committee of the
National Science Foundation of the USA, a reviewer of the
Chang Jiang Scholars Program of the Chinese Ministry of
Education, and expert reviewers for national funding agencies of countries such as Austria,
Canada, Ireland, the Netherlands as well as the USA. Dr. Chen is the founding editor and the
Editor-in-Chief of the journal Information Visualization, the founding editor and the Specialty
Chief Editor of Frontiers in Research Metrics and Analytics, and serves on the editorial board
of Journal of Data and Information Science. His research and scholarly expertise is in the
visual analytic reasoning and assessment of critical information in complex adaptive systems.
His work focuses on identifying emerging trends and potentially transformative changes in the
development of science and technology, especially through computational and visual analytic
approaches. He is the author of The Fitness of Information: Quantitative Assessments of
Critical Information (Wiley, 2014), Turning Points: The Nature of Creativity (Springer, 2011),
Information Visualization: Beyond the Horizon (Springer 2004, 2006) and Mapping Scientific
Frontiers: The Quest for Knowledge Visualization (Springer 2003, 2013). Dr. Chen has
published over 200 peer-reviewed articles in multiple disciplines, including computer science
and information science. His work has been cited over 12,000 times on Google Scholar. His
research has been supported by the National Science Foundation (NSF) and other government
agencies as well as industrial sponsors such as Elsevier, IMS Health, Lockheed Martin, and
Pfizer. His earlier research was funded by the European Commission, the Engineering and
Physical Sciences Research Council (UK), and the Library and Information Commission (UK).
Dr. Chen has designed and developed the widely used visual analytics software CiteSpace
for visualizing and analyzing structural and temporal patterns in scientific literature.
Citation: Chaomei Chen
(2017). Science Mapping:
A Systematic Review of
the Literature.
Corresponding author: Chaomei Chen (E-mail: chaomei.chen@drexel.edu).
Vol. 2 No. 2, 2017
pp 1–40
DOI: 10.1515/jdis-2017-0006
Received: Jan. 30, 2017
Revised: Feb. 6, 2017
Accepted: Feb. 20, 2017
Journal of Data and Information Science Vol. 2 No. 2, 2017
2
Expert Review
Journal of Data and
Information Science
Abstract
Purpose: We present a systematic review of the literature concerning major aspects of science
mapping to serve two primary purposes: First, to demonstrate the use of a science mapping
approach to perform the review so that researchers may apply the procedure to the review of
a scientific domain of their own interest, and second, to identify major areas of research
activities concerning science mapping, intellectual milestones in the development of key
specialties, evolutionary stages of major specialties involved, and the dynamics of transitions
from one specialty to another.
Design/methdology/approach: We first introduce a theoretical framework of the evolution
of a scientific specialty. Then we demonstrate a generic search strategy that can be used to
construct a representative dataset of bibliographic records of a domain of research. Next,
progressively synthesized co-citation networks are constructed and visualized to aid visual
analytic studies of the domain’s structural and dynamic patterns and trends. Finally, trajectories
of citations made by particular types of authors and articles are presented to illustrate the
predictive potential of the analytic approach.
Findings: The evolution of the science mapping research involves the development of a
number of interrelated specialties. Four major specialties are discussed in detail in terms of
four evolutionary stages: conceptualization, tool construction, application, and codification.
Underlying connections between major specialties are also explored. The predictive analysis
demonstrates citations trajectories of potentially transformative contributions.
Research limitations: The systematic review is primarily guided by citation patterns in the
dataset retreived from the literature. The scope of the data is limited by the source of the
retrieval, i.e. the Web of Science, and the composite query used. An iterative query refinement
is possible if one would like to improve the data quality, although the current approach serves
our purpose adequately. More in-depth analyses of each specialty would be more revealing
by incorporating additional methods such as citation context analysis and studies of other
aspects of scholarly publications.
Practical implications: The underlying analytic process of science mapping serves many
practical needs, notably bibliometric mapping, knowledge domain visualization, and
visualization of scientific literature. In order to master such a complex process of science
mapping, researchers often need to develop a diverse set of skills and knowledge that may
span multiple disciplines. The approach demonstrated in this article provides a generic method
for conducting a systematic review.
Originality/value: Incorporating the evolutionary stages of a specialty into the visual analytic
study of a research domain is innovative. It provides a systematic methodology for researchers
to achieve a good understanding of how scientific fields evolve, to recognize potentially
insightful patterns from visually encoded signs, and to synthesize various information so as
to capture the state of the art of the domain.
Keywords Science mapping; Knowledge domain visualization; Domain analysis;
Systematic review; CiteSpace
3
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
1 Introduction
Science mapping is a generic process of domain analysis and visualization. The
scope of a science mapping study can be a scientific discipline, a field of research,
or topic areas concerning specific research questions. In other words, the unit of
analysis in science mapping is a domain of scientific knowledge that is reflected
through an aggregated collection of intellectual contributions from members of a
scientific community or more precisely defined specialties. The scope of the domain
should contain major components that are relevant to the underlying research
program. The structure of the domain may experience constant and possibly
revolutionary changes.
A science mapping study typically consists of several components, notably a body
of scientific literature, a set of scientometric and visual analytic tools, metrics and
indicators that can highlight potentially significant patterns and trends, and theories
of scientific change that can guide the exploration and interpretation of visualized
intellectual structures and dynamic patterns.
Commonly used sources of scientific literature include the Web of Science,
Scopus, Google Scholar, and PubMed. Scientometric methods include author
co-citation analysis (ACA) (Chen, 1999a; White & McCain, 1998), document
co-citation analysis (DCA) (Chen, 2006; Small, 1973), co-word analysis (Callon
et al., 1983), and many other variations. Visualization techniques include graph or
network visualization (Herman, Melançon, & Marshall, 2000), visualizations of
hierarchies or trees (Johnson & Shneiderman, 1991), visualizations of temporaral
structures (Morris et al., 2003), geospatial visualizations, and coordinated views of
multiple types of visualizations. Metrics and indicators of research impact include
citation counts (Garfield, 1955), the h-index (Hirsch, 2005) and its numerous
extensions, and a rich set of altmetrics on social media (Thelwall et al., 2013).
Theories of scientific change include the paradigmatic views of scientific revolutions,
scientific advances driven by competitions, and evolutionary stages of a scientific
discipline. In order to conduct a science mapping study, researchers need to develop
a good understanding of each of the categories of skills and knowledge outlined
above. Furthermore, each of these categories is a current and active research area
in its own right, for instance, the current research on finding the optimal field
normalization method and the debates over how various potentially conflicting
theories of scientific change may be utilized to reveal the underlying mechanisms
of how science advances.
The complexity of science mapping is shared by many research fields. In this
article, our aim is to demonstrate the process of a systematic review based on a
series of visual analytic functions implemented in the continously evolving software
Journal of Data and Information Science Vol. 2 No. 2, 2017
4
Expert Review
Journal of Data and
Information Science
CiteSpace (Chen, 2004; Chen, 2006; Chen, Ibekwe-SanJuan, & Hou, 2010). We
demonstrate the steps of preparing a representative dataset, how to generate
visualizations that can guide our review, and how to identify salient patterns at
various levels of granularity. We also aim to set an example of a systematic review
that can address questions that are commonly asked by researchers when they need
to grasp the state of the art of a fast growing and complex scientific domain.
Since the origin and major milestones of the science mapping research will
become clear as we visualize and explain our results, we will first describe the
methodology and then present the results with our interpretations and reflections as
the systematic review.
2 Theories of Scientific Change
The widely known theory of scientific change is Thomas Kuhn’s scientific
revolutions (Kuhn, 1962). According to Kuhn’s structure of scientific revolutions,
science advances as an iterative revolutionary process, in which scientific paradigms
compete for a predominant position, i.e. paradigm shifts. The process consists of
several stages: normal science, crises, and revolutions. At the normal science stage,
research in a field is primarily dominated by a particular scientific paradigm,
consisting of a set of theories, methods, and a consensus of a research agenda. At
the crisis stage, anomalies become inevitable and they challenge the fundation of
the current paradigm. Alternative and competing paradigms are developed to address
the anomalies. At the revolutionary stage, compeling evidence is accumulated and
competing paradigms become mature enough to take over the existing paradigm
that has been evidently incapable of handling the pressing crises. A new paradigm
replaces the existing paradigm and provides the overarching framework for the
research community. This process repeats itself as the new paradigm becomes
the norm.
A sociological theory of scientific change proposed by Fuchs (1993) challenges
the Kuhnian paradigm shift model as being oversimplified the complex reality.
Instead, Fuchs proposes that advances of science are driven by sociological reasons,
i.e. by scientists who are competing for recognition and reputation in their
organizational settings. Fuchs suggests that interactions of two variables may derive
four types of scientific change. The two variables are task uncertainty and mutual
dependence. Task uncertainty refers to the level of uncertainty involved in the
course of scientific inquiry. The task uncertainty is high in scientific frontiers where
research is essentially exploratory in nature and there is a high amount of tacit
knowledge involved, for example, scientific discoveries of high creativity. In
contrast, the task uncertainty is low in areas where tasks are rountinized. Mutual
dependence refers to the social and organizational dependencies between scientists
5
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
and their competing peers. A combination of high task uncertainty and high mutual
dependence will lead to original scientific discoveries, which will bring a substaintial
degree of recognitions and reputations such as Nobel Prizes. A research area with
intensified competitions is also likely to have a high retraction rate (Chen et al.,
2013). A combination of low task uncertainty and high mutual dependence will
result in specialization to maintain the tension between scientists with high mutual
dependence while they work on routinized research.
A relatively new theory of the evolution of a scientific discipline is proposed
by Shneider (2009). The evolution of a scientific discipline is divided into four
stages. A new discipline, or more generically, a research specialty, begins with a
conceptualization stage – Stage I. At this stage, the object of the research is
established, for example, as in science mapping, the object of science mapping is a
scientific knowledge domain. The goal of the new research is to answer a set of
research questions concerning the newly identified target. The next stage, Stage II,
is characterized by the development of research instruments, or tools, that will
enable researchers to investigate the underlying phenomona. Once researchers are
equppied with special purpose tools, the research moves to the third stage, Stage
III, the investigation of the research questions supported by the newly developed
enabling techniques. This is a prolific stage as many results are produced and the
understanding of the research problems is substantially advanced. The deepened
understanding and a thorough examination may reveal previously unknown
phenomena. The need to address the previously unknown phenomena may lead to
the emergence of another new line of research or a new research specialty. The
original specialty may continue to investigate along the original research agenda. In
addition, tools developed by the original specialty may find their way to contribute
to the development of other subject domains. In other words, a Stage III specialty
may contribute to a specialty in its own Stage II. The final stage of a specialty is
Stage IV. The specialty at this stage is characterized by transferring tacit knowledge
to condified and rountinized knowledge. Comprehensive textbooks are written.
Accumulated domain knowledge is reviewed, synthesized, and conveyed to new
comers to the specialty as well as the existing members of the specialty.
There are other theories of scientific change. For example, a transition model of
Exploration Unification Decline/Displacement was proposed by Mulkay,
Gilbert, and Woolgar (1975). Nevertheless, the three outlined above are represenative
and they sufficiently cover the major characteristics of the development of a
scientific field for the purpose of our systematic review.
We can see obvious overlaps as well as clear distinctions among them. For
example, Shneider’s Stage III overlaps with Kuhn’s crisis stage. Fuchs’ high task
uncertainty and high mutual dependence may characterize a Stage I specialty. As a
Journal of Data and Information Science Vol. 2 No. 2, 2017
6
Expert Review
Journal of Data and
Information Science
specialty transits from Stage III to Stage IV, it may transform itself from a high-high
uncertainty-dependence mode to a low-high mode. A specialty may start to decay
and it may be forgotten, but it may be revitalized years later, as a sleeping beauty
in the conceptual world (van Raan, 2003).
In addition to the four evoluationary stages, Shneider also proposes that each
stage may suit a particular type of scientists better than other stage in terms of how
a type of talent matches a particular stage. In fact, Shneider suggests that the better
scientists understand the four stages of development the more effectively they may
optimize their career path. In addition, Shneider emphasizes that researchers at
different stages may not have the mindset that would enable them to evaluate a
discipline at a different evolutionary stage.
In corresponding to the four stages, we may identify the most suitable scientists
for each stage as creative thinkers, visioners, boundary spanners, or brokers for
scientists who may excel at Stage I; inventors and tool builders for Stage II; adaptors
and experimenters for Stage III; and sythesizers, codifiers, and educators for Stage
IV. In this review, we will adopt Shneider’s four stage model in our interpretation
of the results.
3 Method
3.1 Data Collection
The input data of our review is generated by a combination of the results from
multiple topic search queries to the Web of Science (Figure 1). The rationale of the
query construction is as follows. First, we would like to ensure that currently widely
used science mapping tools such as VOSViewer, CiteSpace, HistCite, SciMAT, and
Sci2 are covered by our topic search query. Thus, publications that mention any of
these software tools in their titles, abstracts, and/or keyword lists will be included.
This query generates 135 records as Set #1 (Figure 1).
Second, since the goal of science mapping is to identify the intellectual structure
of a scientific domain, the second query focuses on the object of science mapping,
including topic terms such as intellectual structure, scientific change, research front,
invisible college, and domain analysis. As we will see later on, terms such as domain
analysis may be ambiguous as they are also used in other contexts that are irrelevant
to science mapping. In practice, we recommend to defer the assessment of relevance
until the analysis stage. This query produces 13,242 records as Set #2 (Figure 1).
The third query focuses on scientometric and visual analytic techniques that are
potentially relevant to science mapping. Topic terms include science mapping,
knowledge domain visualization, information visualization, citation analysis, and
co-citation analysis. This query leads to 4,772 records (Figure 1).
7
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
The queries #4–#10 aim to retrieve bibliographic records on the common data
sources for science mapping, including Scopus (6,782 records), the Web of Science
(15,401 records), Google Scholar (5,170 records), Pubmed (46,760 records), and
Medline (61,405 records) (Figure 1).
The final dataset is Set #14, containing 17,731 bibliographic records of the types
of Article or Review in English (Figure 1). This query formation strategy is generic
enough to be applicable to a science mapping study unless of course one has access
to the entire database.
Patents and research grants are other types of data sources one may consider, but
for this particular review, we are limited to the scientific literature indexed by the
Web of Science.
Figure 1. Topic search queries used for data collection.
3.2 Visualization and Analysis
We visualize and analyze the dataset with a new version of CiteSpace (5.0.R3
SE) (Figure 2). CiteSpace has been continuously developed to meet the needs for
visual analytic tasks of science mapping. The new version will be released shortly
to the public.
Journal of Data and Information Science Vol. 2 No. 2, 2017
8
Expert Review
Journal of Data and
Information Science
Figure 2. The main user interface of CiteSpace.
CiteSpace takes a set of bibliographic records as its input and models the
intellectual structure of the underlying domain in terms of a synthesized network
based on a time series of networks derived from each year’s publications. CiteSpace
supports several types of bibliometric studies, including collaboration network
analysis, co-word analysis, author co-citation analysis, document co-citation
analysis, and text and geospatial visualizations. In this study, we focus on the
document co-citation analysis within the period of time between 1995 and 2016.
The Set #14 contains 16,250 records published in the range of 1980–2017
(Figure 3). These records collectively cited 515,026 references. The document
co-citation analysis function in CiteSpace constructs networks of cited references.
Connections between references represent co-citation strengths. CiteSpace uses a
time slicing technique to build a time series of network models over time and
synthesizes these individual networks to form an overview network for the systematic
review of the relevant literature.
The synthesized network is divided into co-citation clusters of references. Citers
to these references are considered as the research fronts associated with these
clusters. Each cluster represents the intellectual base of the underlying specialty.
According to Shneider’s four stage model, the intellectual base of a specialty and
9
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
the corresponding research fronts provide valuable insights into the current stage of
the specialty as well as the intellectual milestones in the evolution of the specialty.
Our first step in the review is to make sense of the nature of major clusters and
characteristics that may inform us about the stage of the underlying specialties. In
this study, we consider a cluster as the embodiment of an underlying specialty. Thus,
science mapping consists of multiple specialties that contribute to various aspects
of the domain.
In each cluster, we focus on cluster members that are identified by structural and
temporal metrics of research impact and evolutionary significance. A commonly
used structural metric is the betweenness centrality of a node in a network. Studies
have shown that nodes with high betweenness centrality values tend to identify
boundary spanning potentials that may lead to transformative discoveries (Chen
et al., 2009). Burst detection is a computational technique that has been used to
identify abrupt changes of events and other types of information (Kleinberg, 2002).
In CiteSpace, the sigma score of a node is a combinant metric of the betweenness
centrality and the citation burstness of the node, i.e. the cited reference. CiteSpace
represents the strength of these metrics through the design of visual encoding such
that articles that are salient in terms of these metrics will be easy to see in the
visualizations. For example, the citation history of a node is depicted as a number
of treerings and each treering represents the number of citations received in the
corresponding year of publication. If a citation burst is detected for a cited reference,
the corresponding treering will be colored in red. Otherwise, treerings will be
colored by a spectrum that ranges from cold colors such as blue to warm colors
such as orange.
The nature of a cluster is identified from the following aspects: a hierarchy of
key terms in articles that cite the cluster (Tibély et al., 2013), the prominent members
of the cluster as the intellectual milestones in its evolution and as the intellectual
Figure 3. The distribution of the bibliographic records in Set #14.
Journal of Data and Information Science Vol. 2 No. 2, 2017
10
Expert Review
Journal of Data and
Information Science
base of the specialty, recurring themes in the citing articles to the cluster to reflect
the interrelationship between the intellectual base and the research fronts. In
particular, we will pay attention to indicators of the evolutionary stages of a specialty
such as the original conceptualization, research instruments, applications, and
routinization of the domain knowledge of the specialty.
In addition to the study of citation-based patterns, we will demonstrate the concept
of citation trajectories in the context of distinct clusters. According to the theory of
structural variation, the transformative potential of an article may be reflected
by the extent to which it varies the existing intellectual structure (Chen, 2012).
For example, if an article adds many inter-cluster links, it may alter the overall
structure. If the structural change is subsequently accepted and reinforced by other
researchers, then transformative changes of the knowledge become significant in a
socio-cognitive view of the domain.
4 Results
A dual-map overlay of the science mapping literature represents the entire dataset
in the context of a global map of science generated from ove 10,000 journals indexed
in the Web of Science (Chen & Leydesdorff, 2014). The dual-map overlay shows
that science mapping papers are published in almost all major disciplines
(Figure 4). Publications in the discipline of information science (shown in the map
as curves in cyan) are built on top of at least four disciplines on the right-hand side
of the map.
A hierarchical visualization of index terms, i.e. keywords, is generated to represent
the coverage of the dataset. Five semantic types of nodes are annotated in the
visualized hierarchy:
What: A fundamental phenomenon of a specialty and the object of a study, for
example, the intellectual structure or the dynamics of a research field
How: Methodologies, procedures, and processes of science mapping, for
example, author co-citation analysis, bibliometric mapping, and co-citation
analysis
Abstraction: Computational models of an underlying phenomenon identified
from the bibliographic data, representations such as Pathfinder networks,
metrics, and indicators such as the h-index and the g-index
Tools: Computational techniques, algorithms and software tools for visualization
and ranking scholarly publications
Data: Data sources used by science mapping studies, for example, Scopus and
Google Scholar
11
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
Figure 4. A dual-map overlay of the science mapping literature.
Journal of Data and Information Science Vol. 2 No. 2, 2017
12
Expert Review
Journal of Data and
Information Science
These semantic types will be also used to identify the evolutionary stage of a
specialty. For example, if a cluster contains several articles that report the
development of software tools, then the underlying specialty is considered as a
specialty that has reached at least Stage II. If the methdologies appear in a cluster
of knowledge domains external to information science, such as regenerative
medicine and strategic management research, then we will consider the specialty
has reached Stage III – tools developed by the specialty are applied to other subject
domains. In the following analysis, we will use the terms in the hierarchy as the
primary source of our vocabulary to identify the role of the contributions made by
a scientific publication to a specialty.
Major milestones in the development of science mapping can be identified from
the list of references that have strong citation bursts between 1995 and 2016.
References with strong values in the Stregth column tend to be significant milestones
for the science mapping research. We label such references with high-level concepts.
For example, the first milestone paper in the study is a landmark ACA study of
information science (White & McCain, 1998). The next milestone is a major
collection of seminal papers in information visualization (Card, Mackinlay, &
Shneiderman, 1999). Other major milestones include visual analytics (Thomas &
Cook, 2005), and the h-index (Hirsch, 2005).
4.1 Landscape View
The following landscape view is generated based on publications between 1995
and 2016 (Figure 5). Top 100 most cited publications in each year are used to
construct a network of references cited in that year. Then individual networks are
synthesized. The synthesized network contains 3,145 references. The network
contains 603 co-citation clusters. The three largest connected components include
1,729 nodes, which account for 54% of the entire network. The network has a
modularity of 0.8925, which is considered as very high, suggesting that the specialties
in science mapping are clearly defined in terms of co-citation clusters. The average
silhouette score of 0.3678 is relative low mainly because of the numerous small
clusters. The major clusters that we will focus on in the review are sufficiently high.
The areas of different colors indicate the time when co-citation links in those
areas appeared for the first time. Areas in blue were generated earlier than areas in
green. Areas in yellow were generated after the green areas and so on. Each cluster
can be labled by title terms, keywords, and abstract terms of citing articles to the
cluster. For example, the yellow-colored area at the upper right quadrant is labeled
as #3 information visualization, indicating that Cluster #3 is cited by articles on
information visualization. The largest node is the paper that introduces the h-index.
Other nodes with red treerings are references with citation bursts (Figure 6).
13
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
4.2 Timeline View
A timeline visualization in CiteSpace depicts clusters along horizontal timelines
(Figure 7). Each cluster is displayed from left to right. The legend of the publication
time is shown on top of the view. The clusters are arranged vertically in the
descending order of their size. The largest cluster is shown at the top of the view
(Figure 8). The colored curves represent co-citation links added in the year of the
corresponding color. Large-sized nodes or nodes with red treerings are of particular
interest because they are either highly cited or have citation bursts or both. Below
Figure 5. A hierarchy of indexing terms derived from Set #14.
Journal of Data and Information Science Vol. 2 No. 2, 2017
14
Expert Review
Journal of Data and
Information Science
Figure 6. 49 references with citation bursts of at least 5 years.
each timeline the three most cited references in a particular year are displayed. The
label of the most cited reference is placed at the lowest position. References
published in the same year are placed so that the less cited references are shifted to
the left. The new version of CiteSpace supports the function to generate labels of a
cluster year by year based on terms identified by Latent Semantic Indexing (LSI)
(Deerwester et al., 1990). The year-by-year labels can be displayed in a table or
above the corresponding timeline. Users may control the displays interactively.
15
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
Clusters are numbered from 0, i.e. Cluster #0 is the largest cluster and Cluster #1
is the second largest one. As shown in the timeline overview, the sustainability of
a specialty varies. Some clusters sustain a period over 20 years, whereas some
clusters are relatively short-lived. Some clusters remain active until the 2015, the
most recent year of publication for a cited reference in this study.
Each of the largest five clusters has over 150 members (Table 2). The largest
cluster’s homegeneity in terms of the silhouette score is slightly lower than that of
the smaller clusters. The largest cluster represents 4.5% of the references from the
entire network and 8.1% of the largest three connected components of the network
(LCCs). In this study, our review will primarily focus on the largest five clusters.
Figure 7. A landscape view of the co-citation network, generated by top 100 per slice between 1995 and 2016
(LRF = 3, LBY = 8, and e = 1.0).
Journal of Data and Information Science Vol. 2 No. 2, 2017
16
Expert Review
Journal of Data and
Information Science
Figure 8. A timeline visualization of the largest clusters of the total of 603 clusters.
17
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
The duration of a cluster is particularly interesting (Table 2). The largest cluster
lasts 21 years and it is still active. Cluster #3 spans a 19-year period and also remains
to be active. In contrast, Cluster #6 on webometrics ends by 2006, but as we will
see, relevant research finds its way in new specialties, notably in the form of
altmetrics.
4.3 Major Specialties
In the following discussion, we will particularly focus on the five largest clusters
(Table 1). A research programme, or a paradigm, in a field of research can be
characterized by its intellectual base and research fronts. The intellectual base is the
collection of scholarly works that have been cited by the corresponding research
community, whereas research fronts are the works that are inspired by the ones of
the intellectual base. A variety of research fronts may rise from a common intellectual
base.
Table 1. The five largest clusters of co-cited references of the network of 3,145 references. The largest three
connected components include 1,729 of the references.
Cluster Size Mean
(year) Silhouette % of the
network
Accumulated %
of network
% of top 3
LCCs
Accumulated %
of LCCs
0 214 2006 0.748 4.5 4.5 8.1 8.1
1 209 1997 0.765 2.3 6.7 4.1 12.2
2 190 2009 0.845 3.3 10.0 6.0 18.2
3 160 2005 0.954 2.9 12.9 5.3 23.5
4 152 1992 0.890 1.7 14.6 3.0 26.5
4.3.1 Cluster #0 – Science Mapping
Cluster #0 is the largest cluster, containing 214 references across a 21-year period
from 1995 till 2015 (Table 2). The median year of all references in this cluster is
2006, but the median year of the 20 most representative citing articles to this cluster
is 2010. This cluster’s silhouette value of 0.748 is the lowest among the major
clusters, but this is generally considered a relatively high level of homegeneity.
The primary focus of the large and currently active cluster is on the intellectual
structure of a scientific discipline, a field of research, or any sufficiently self-
contained domain of scientific inquiry. Key concepts identified from the titles
of citing articles to this cluster can be algorithmically organized according to
hierarchical relations derived from co-occurring concepts (Figure 9). The largest
branch of a such hierarchy typically reflects the core concepts of scholarly
publications produced by the specialty behind the cluster. For example, concepts
such as intellectual structure, co-citation analysis, and co-authorship network
underline the primary interest of this specialty.
Journal of Data and Information Science Vol. 2 No. 2, 2017
18
Expert Review
Journal of Data and
Information Science
Figure 9. A hierarchy of key concepts selected from citing articles of Cluster #0 by log-likelihood ratio test.
We can use a simple method to classify various terms into two broad categories:
domain-intrinsic or domain-extrinsic. Domain-intrinsic terms belong to the research
field that aims to advance the conceptual and methodological capabilities of science
mapping, for example, intellectual structure and co-citation analysis. In contrast,
domain-extrinsic terms belong to the domain to which science mapping techniques
are applied. In other words, they belong to the domain that is the object of a science
mapping study. For example, stem cell research per se may not directly influence
the advance of a specialty that is mainly concerned with how to identify the
intellectual structure of a research field from scientific literature. Information
science has a unique position. On the one hand, it is the discipline that hosts a
considerable number of fields relevant to science mapping. On the other hand, it is
the most frequent choice of a knowledge domain to test drive newly developed
techniques and methods.
The timeline visualization reveals three periods of its development (Figure 10).
The first period is from 1995 to 2002. This period is relatively uneventful without
high-profile references in terms of citation counts or bursts. Two visualization-
centric domain analysis articles (Boyack, Wylie, & Davidson, 2002; Chen et al.,
Table 2. Temporal properties of major clusters.
Cluster
ID Size Silhouette From To Dura-
tion Median Sustain-
ability
Active-
ness Theme
0 214 0.748 1995 2015 21 2006 ++++++ Active Science mapping
1 209 0.765 1990 2006 17 1997 ++ Inactive Domain analysis
2 190 0.845 2000 2015 16 2009 Active Research evaluation
3 160 0.954 1996 2014 19 2005 ++++ Active Information visualization /
Visual analytics
4 152 0.890 1988 1999 12 1993 Inactive Applications of ACA
6 125 0.925 1995 2006 12 2001 Inactive Webometrics
8 93 0.882 1994 2010 17 2002 ++ Inactive Bibliometric studies of social
work in health
11 48 0.965 1994 2006 13 2000 Inactive Bibliometric studies of
management research
12 44 0.966 1990 1999 10 1996 Inactive Graph visualization
16 29 0.977 1999 2007 9 2003 Inactive Bibliometric studies of
information systems
28 15 0.995 2004 2013 10 2008 Inactive Global trend; Water resources
19
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
2002), preluded the subsequent wave of high-impact studies appeared in the second
period. This period also features a social network analysis tool UCINET (Borgatti,
Everett, & Freeman, 2002).
The second period is from 2003 to 2010. Unlike the first period, the second period
is full of high impact contributions – large citation treerings and periods of citation
bursts colored in red. Several types of high impact contributions appeared in this
period, notably.
literature reviews – (Börner, Chen, & Boyack, 2003)
software tools – CiteSpace (Chen, 2004), CiteSpace II (Chen, 2006), CiteSpace
III (Chen et al., 2010), VOSViewer (van Eck & Waltman, 2010)
science mapping applications – visualization of information science (White,
2003), mapping the backbone of science (Boyack, Klavans, & Börner, 2005)
and a global map of science based on ISI subject categories (Leydesdorff &
Rafols, 2009)
metrics and indicators – a critique on the use of Pearson’s correlation coefficients
as co-citation similarities – a previsouly common practice in ACA studies
(Ahlgren, Jarneving, & Rousseau, 2003)
applications to other domains – a bibliometric study of strategic management
research (Ramos-rodriguez, 2004) and another ACA of strategic management
research (Nerur, 2008)
The third period is from 2010 to 2015. Although no citation bursts were detected
so far in this period, the themes of this period shed additional insights into the more
Figure 10. High impact members of Cluster #0.
Journal of Data and Information Science Vol. 2 No. 2, 2017
20
Expert Review
Journal of Data and
Information Science
recent developmental status of the specialty. Most cited publications in this period
include a study of the cognitive structure of library and information science
(Milojević et al., 2011) and a few studies that focus on domains with no apparent
overlaps with computer and information science, for example regenerative medicine
(Chen, Dubin, & Kim, 2014; Chen et al., 2012) and strategic management (Vogel
& Güttel, 2013).
A specialty may experience the initial conceptualization stage, the growth of
research capabilities through the fourish of research tools, the expansion stage when
researchers apply their methods to subject domains beyond the original research
problems, and the final stage of decay (Shneider, 2009). The largest cluster is
dominated by an overwhelming number of tool-related references. The top 20 most
cited members of the cluster (Figure 11) include several software tools such as
CiteSpace (Chen, 2006; Chen et al., 2010), UCINET (Borgatti et al., 2002),
VOSViewer (van Eck & Waltman, 2010), and global maps of science (Leydesdorff
& Rafols, 2009). In terms of the four-stage evolution model of Shneider, the
underlying specialty evidently reached Stage II – the tool building stage by 2010.
Figure 11. Top 20 most cited references in the largest cluster.
The cluster also includes several author co-citation studies of disciplines and
research areas such as information science (Chen et al., 2010; White, 2003) and
strategic management (Nerur, 2008; Ramos-rodriguez, 2004). White (2003) revisits
the intellectual structure of information science. Instead of using multidimensional
scaling technique as they did in a previous study of the domain, the new study
applied the Pathfinder network scaling technique and demonstrated the advantages
of the technique. Pathfinder network scaling was first introduced to author co-citation
analysis in (Chen, 1999b). Although the studies of strategic management research
can be seen as applications outside the original specialty of author co-citation
analysis, studies of information science typically involve the development of new
tools as well as applications of existing tools.
21
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
Articles that cited members of the cluster convey additional information for us
to understand the dynamics of the specialty. The top 20 citing articles (Figure 12)
ranked by the bibliographic overlap with the cluster reveal similar types of
contributions, namely software tools and techniques (1, 2, 5, 8, 14), new methods
(9, 11, 16, 19, 20), surveys and reviews (3, 10, 13), and applications of bibliometric
studies (6, 12, 17).
Figure 12. Major citing articles to the largest cluster.
The timeline visualization suggests that the specialty represented by the largest
cluster has cumulated sufficient research techniques and tools by the end of the third
period and that the specialty is likely in Stage III of its evolution. In other words,
the specialty is currently dominated by the need to apply these new techniques to a
broader range of scientific domains and address research questions at new levels.
According to Shneider’s four-stage model, Stage III is also the evolutionary stage
when the specialty may encounter discoveries that could change the course of the
development of the specialty.
At a more pragmatic level, one may monitor the further development of the
specialty by tracking research fronts that are built on the early stages of the specialty.
One can monitor emerging trends and patterns in terms of the major dimensions in
the latent semantic space spanned by each year’s publications connected to this
particular cluster. For example, the growing number of domain-extrinsic terms such
as nanotechnology, case study, and solar cell suggest an expansion of the research
scope – a hallmark of a Stage III specialty.
4.3.2 Cluster #1 – Domain Analysis
Cluster #1 is the second largest cluster, containing 209 references that range a
17-year duration from 1990 to 2006. The cluster, or its underlying specialty, is
Journal of Data and Information Science Vol. 2 No. 2, 2017
22
Expert Review
Journal of Data and
Information Science
largely inactive with reference to the resolution of this study. This cluster is
dominated by representative terms such as information retrieval, domain analysis,
scholarly communication, and intellectual space (Figure 13). Although information
retrieval is the root node in the hierarchy of key terms in this cluster, domain analysis
underlines the conceptual fundation of this cluster, as we will see shortly.
Figure 13. A hierarchy of key concepts in Cluster #1.
Two outstanding references from the timeline visualization of this cluster
(Figure 14) have strong citation burstness. One is a doman analysis of information
science (White & McCain, 1998), in which the multidimensional scaling of an
author co-citation space was utilized to visualize the intellectual structure of the
domain. The other is a study of major approaches to domain analysis (Hjørland,
2002). In early 1990s, Hjørland developed a domain-analytic apporach, also known
as socialogical-epistemological approach or a socio-cognitive view, as a
methdological alternative to the then methodological individualism and cognitive
perspective toward information science that largely marginalized the social,
historical, and cultural roles in understanding a domain of scientific knowledge.
Hjørland’s another article on domain analysis is also a member of the cluster
(Hjørland, 1997).
Figure 14. Key members of Cluster #1.
The sigma score of a cited reference reflects its structural and temporal signific ance.
In addition to the author co-citation analysis of information science (White &
McCain, 1998), two more author co-citation studies are ranked highly by their sigma
scores (Figure 15), namely an author co-citation study of information retrieval (Ding,
Chowdhury, & Foo, 1999), and an author co-citation study of hypertext (Chen,
1999b).
23
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
The review article by White and McCain (1997) on “visualization of literature”
is an important member of the cluster, whereas Tabah’s review of the study of
literature dynamics (Tabah, 1999) is a citing article to the cluster. Although the term
domain analysis was not used consistently during the period of this cluster, the
contributions consistently focus on holistic views of a knowledge domain. As
Hjørland argued, domain analysis serves a fundamental role in information science
because its goal is to understand the subject matter from a holistic view of
sociological, cognitive, historical, and epistemological dimensions.
Figure 15. Key members of Cluster #1, sorted by sigma.
Citing articles to Cluster #1 (Figure 16) include some of the earliest attempts to
integrate information visualization techniques to the methodology of a domain
analysis (Börner et al., 2003; Boyack, Wylie, & Davidson, 2002; Chen et al., 2002).
Interestingly, some of these citing articles appear as cited references in Cluster #0.
In other words, the downturn of Cluster #1 does not mean that researchers lost their
interest in the domain analysis approaches. Rather, they shifted their focus to explore
a new generation of domain analysis with the support of a variety of computational
and visualization techniques. As a result, the specialty underlining Cluster #0
continues the vision conceived in the works of Cluster #1. The citers of Cluster #1
identify the group of researchers who would be the core members of the specialty
of the new generation of domain analysis.
Author co-citation analysis (ACA) plays an instrumental role in the development
of the domain analysis specialty embodied in Cluster #1. It is not only a bibliometric
method that has been adopted by researchers beyond information science, but also
a research instrument that helps to reveal challenges that the next generation of
domain analysis must deal with.
In their ACA study of information science, White and McCain (1998) masterfully
demonstrated the power and the potential of what one may learn from a holistic
view of the intellectual landscape of a discipline. They utilized the multidimensional
Journal of Data and Information Science Vol. 2 No. 2, 2017
24
Expert Review
Journal of Data and
Information Science
scaling technique as a vehicle for visualization and tapped into their encyclopedic
knowledge of the information science discipline in an intellectually rich guided tour
across the literature. In an attempt to enrich and enhance the conventional
methodology of ACA, Chen (1999b) introduced the Pathfinder network scaling
technique. Using Pathfinder networks brings several advantages to the methodology
of ACA, including the ability to identify and preserve salient structural patterns and
algorithmically derived visual cues to assist the navigation and interpretation of
resultant visualizations (Chen & Morris, 2003). White (2003) revisited the ACA
study of information science with Pathfinder network scaling. A fast algorithm to
compute Pathfinder networks is published in 2008 (Quirin et al., 2008).
The re-introduction of the network thinking opens up a wider variety of
computational techniques to an ACA study, notably network modeling and
visualization. Furthermore, technical advances resulted from the improvement of
ACA have been applied to a broader range of biliometric studies, notably document
co-citation analysis (DCA) (Chen, 2004; Small, 1973, 1999). As we will see shortly,
the adaptation of network modeling and information visualization techniques in
general results from a Stage III specialty of information visualization and visual
analytics.
4.3.3 Cluster #2 – Research Evaluation
Cluster #2 is the third largest cluster with 190 cited references and a silhouette
value of 0.845, which is slightly higher than the previous two larger clusters #0 and
#1, suggesting a higher homogeneity. In other words, one would consider this
specialty a more specialized than the previously identified specialties. This cluster
is active over a 16-year period from 2000 till 2015. It represents an active specialty.
Figure 16. Citing articles to Cluster #1.
25
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
The overarching theme of the cluster (Figure 17) is suggested by the two major
branches shown in the hierarchy of key terms of this cluster (Figure 18): the
information visualization branch and the much larger branch of research evaluation.
The information visualization branch highlights the recurring themes of intellectual
structure and co-citation analysis. The research evaluation branch highlights
numerours concpets that are central to measuring scholarly impact, notably h-index,
bibliometric ranking, bibliometric indicator, sub-field normalization, web indicator,
citation distribution, social media metrics, and alternative metrics.
Figure 17. A hierarchy of key concepts in Cluster #2.
Figure 18. High impact members of Cluster #2.
The 6-year period from 2005 through 2010 is a highly active period of the cluster.
The most prominent contributions in this period (Figure 19) include the original
article that introduces the now widely known h-index (Hirsch, 2005), the subsequent
introduction of g-index as a refinement by taking citations into account (Egghe,
2006), a 2007 study that compares the impact of using the Web of Science, Scopus,
and Google Scholar on citation-based ranking (Meho & Yang, 2007), a 2008 review
entitled “What do citation counts measure?” (Bornmann & Daniel, 2008), and a
study of the universality of citation distributions (Radicchi, Fortunato, & Castellano,
Journal of Data and Information Science Vol. 2 No. 2, 2017
26
Expert Review
Journal of Data and
Information Science
2008). These papers are also among the top sigma ranked members of this cluster
because of their structural centrality as well as the strength of their citation burstness.
The top 20 citing articles of the cluster (Figure 20) reveal a considerable level of
thematic consistency. The overarching theme of research evaluation is evidently
behind all these articles with popular title terms identified by latent semantic
indexing such as citation impact, scientific impact, impact measures, bibliometric
indicators, research evaluation, and Web indicators.
Figure 20. Citing articles of Cluster #2.
Some of the more recent and highly cited members in Cluster #2 include a
comparative study of 11 altmetrics and counterpart articles matched in the Web of
Science (Thelwall et al., 2013), the Leiden manifesto for research metrics (Hicks
Figure 19. High impact members of Cluster #2.
27
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
et al., 2015), and a study of power law properties in citation distributions based on
over 6 millions of Scopus records (Brzezinski, 2015).
4.3.4 Cluster #3 – Information Visualization and Visual Analytics
Cluster #3 is the fourth largest cluster. Its duration ranges from 2004 through
2014. The topic hierarchy (Figure 21) has two branches: information visualization
and heart rate variability (Figure 22). The heart rate variability does not belong to
the domain analysis in the context of information science. In fact, its inclusion in
the original results of the topic search was due to the ambiguity of the term domain
analysis across multiple disciplines. Pragmatically it is easier and more efficient to
simply skip an irrelevant branch than keep refining the original topic search query
untill all noticeable irrelevant topics are eliminated. This is one of the foundatmental
challenges for information retrieval and this is where domain analysis has an
instrumental role to play (Hjørland, 2002).
Figure 21. A hierarchy of key concepts in Cluster #3.
Figure 22. High impact members of Cluster #3.
The information visualization branch includes a mixture of information
visualization techniques such as fisheye view, group drawing, graph visualization,
and visual analytics and topics that are center to information science such as citation
analysis, information retrieval. The mixture is a sign of attempts to apply information
visualization and visual analytic techniques to bibliometric approaches to the study
Journal of Data and Information Science Vol. 2 No. 2, 2017
28
Expert Review
Journal of Data and
Information Science
of intellectual structure of a research domain. The vision of information visualization
is to identify insightful patterns from abstract information (Card, Mackinlay, &
Shneiderman, 1999; Chen, 2005; 2010). The subsequently emerged visual analytics
emphasizes the critical and more specific role of sense-making and analytic reasoning
in accomplishing such goals (Chen, 2008; Keim et al. , 2008; Thomas & Cook, 2005).
High-impact contributions in Cluster #3 (Figure 23) include the collection of
seminal works in information visualization (Card, Mackinlay, & Shneiderman,
1999), a survey of graph visualization techniques (Herman, Melançon, & Marshall,
2000), Cytoscape – a widely used software tool for visualizing biomolecular
interaction networks (Shannon et al., 2003), the ground breaking work of visual
analytics (Thomas & Cook, 2005), Many Eyes – the popular web-based visualization
platform (Viégas et al., 2007), and a framework of seven types of interaction
techniques in information visualization (Yi et al., 2007).
Figure 23. Key members of Cluster #3.
In addition to the above high-impact contributions, this cluster features
information visualization tools (Figure 24) such as the InfoVis toolkit (Fekete,
2004), NodeTrix (Henry, Fekete, & McGuffin, 2007), Jigsaw – a visual analytic tool
(Stasko, Gorg, & Liu, 2008), and D3 (Bostock, Ogievetsky, & Heer, 2011). The
most widely used information visualization tools such as Many Eyes and D3 became
available between 2007 and 2011.
According to Shneider’s four stage model, the information visualization and
visual analytics specialty in the context of domain analysis and literature visualization
has demonstrated properties of a Stage IV specialty. For example, in the most recent
few years of the cluster, researchers reflect on empirical evaluations of information
visualization in various scenarios (Lam et al., 2012), revisit taxonomic organizations
of abstract visualization tasks (Brehmer & Munzner, 2013), and synthesize and
codify domain knowledge in the forms of textbooks (Munzner, 2014).
29
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
4.3.5 Remaining Clusters
The remaining clusters are either relatively small in size or short in terms of the
length of their duration. We will omit detailed discussions of these clusters. Readers
may refer to supplementary materials provided on the project website. We outline
a few more relevant clusters as follows.
Cluster #4 focuses on applications of bibliometric studies to research domains
such as decision support systems and information retrieval studies. Top cited
references in this cluster are mostly articles published in the early 1990s.
Cluster #6 focuses on webometrics, led by an article on methdological approaches
to webometrics (Almind & Ingwersen, 1997). This cluster was active during the
period between 1995 and 2006. The leading contributors of this specialty such
as Mike Thelwall continue to make active contributions to Cluster #2 Research
Evaluation, especially in association with the development of altmetrics. A review
of scholarly communication and bibliometrics by Borgman and Furner (2002) is
also a key member of this cluster.
4.4 Trajectories of Citations across Cluster Boundaries
Cluster analysis helps us to understand the major specialties associated with
science mapping. Now we turn our attention to the trajectories of several leading
contributors in the landscape of these clusters. We are interested in what we may
learn from citation links made in publications of a scholar, especially those links
bridging distinct clusters.
Figure 24. Citing articles of Cluster #3.
Journal of Data and Information Science Vol. 2 No. 2, 2017
30
Expert Review
Journal of Data and
Information Science
4.4.1 Trajectories of Prolific Authors
The first example is the citation trajectory of Howard White (Figure 25). He is
the author of several seminal papers featured in several clusters. His citation
trajectories move across the citation landscape from the left to the center, ranging
from #4 decision support system (applications of ACA), #1 domain visualization
(domain analysis), and #8 social work (another cluster of bibliometric studies).
Figure 25. Novel co-citations made by eight papers of White (left) and by 14 papers of Thelwall (right).
The second example is the citation trajectory of Mike Thelwall (Figure 25). He
is a prolific researcher who contributed to webometrics and altmetrics among other
areas of bibliometrics. An overlay of his citation trajectories on a citation landscape
view shows that his trajectories spanning clusters such as #6 university websites
(webometrics) and # google scholar (research evaluation).
In both examples of citatoin trajectories, we have observed that their citation
trajectories span across a wide area over the citation landscape. Monitoring the
movement of citation trajectories in such a way provides an intuitive insight into
the evolution of the underlying specialties and the context in which high-impact
researchers make their contributions.
4.4.2 Articles with Transformative Potentials
It is widely known that a major limitation of any citation-based indicators is their
reliance on citations accumulated over time. Thus, citation-based indicators are
likely to overlook newly published articles. An alternative method is to focus on the
extent to which a newly published article brings to the conceptual structure of the
knowledge domain of interest (Chen, 2012). The idea is to identify the potential of
an article to make extrordinary or unexpected connections across distinct clusters.
According to theories of scientific discovery, many significant contributions are
resulted from boundary spanning ideas.
31
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
Table 3 lists three articles each year for the last five years. These articles have
the highest geometric mean of three structural variation varables generated by
CiteSpace. For example, in 2016, the highest score goes to the review of citation
impact indicators by Waltman (2016), followed by two bibliometric analyses – one
contrasts two closely related but distinct domains and the other studies the research
over a 20-year span (Figure 26). In 2015, two bibliometric studies followed by a
review of theory and practice in scientometrics (Mingers & Leydesdorff, 2015).
These highly ranked articles represent a few types of studies that may serve
as predictive indicators, namely review papers (Mingers & Leydesdorff, 2015;
Waltman, 2016), applications of bibliometric studies to specific domains, software
tools for science mapping (Cobo et al., 2012), new metrics and indicators (Li et al.,
2013), and visual analytic studies of unconventional topics – retractions (Chen
et al., 2013).
4.5 The Emergence of a Specialty
The emergence of a specialty is determined by two factors: the intellectual base
and the research fronts associated with the intellectual base. The intellectual base
is what the specialty cites, whereas the research fronts are what the specialty is
currently addressing. As we have seen, on the one hand, a research front may remain
in the same co-citation cluster as in the case of Cluster #2 Research Evaluation. On
the other hand, a research front may belong to a different specialty and become the
intellectual base of a new specialty as in the case of Cluster # Domain Analysis and
Cluster #0 Bibliometric Mapping.
The citation trajectories of a researcher’s publications and the positions of these
publications as cited references can be simultaneously shown by overlaying
trajectories (dashed lines for novel links or solid lines for existing links) and citing
papers as stars if they also appear in a co-citation cluster as cited references. For
example, the series of stars in the following visualization (Figure 27) tell us two
things: First, the author is connecting topics in two clusters (Cluster #0 Science
Mapping and Cluster 2 Research Evaluation) and second, the author belongs to the
specialty of science mapping.
The example below (Figure 28) illustrates the citation trajectories of Howard
White’s publications and their own positions in the timelines of clusters. His
publications appear in the early stage of the science mapping cluster (#0) and make
novel connections between science mapping and domain analysis (Cluster #1),
domain analysis (Cluster #1) and applications of ACA (Cluster #4), domain analysis
(Cluster #1) and webometrics (Cluster #6).
The next example (Figure 29) depicts the novel co-citation links made by a
review paper of informetrics (Bar-Ilan, 2008). These novel links include within-
cluster links as well as between-cluster links. It should be easy to tell that the scope
Journal of Data and Information Science Vol. 2 No. 2, 2017
32
Expert Review
Journal of Data and
Information Science
Table 3. Potentially transformative papers published in recent years (2012–2016).
Year MCLwCKL
Geometric
Mean GC Title Reference
2016 6.0541 0.0152 0.0251 0.1322 5 A review of the literature on citation impact indicators (Waltman, 2016)
2016 0.9235 0.0019 0.3407 0.0842 0 How are they different? A quantitative domain comparison
of information visualization and data visualization
(2000–2014)
(Kim, Zhu, & Chen, 2016)
2016 0.8207 0.0017 0.0640 0.0447 2 A bibliometric analysis of 20 years of research on software
product lines
(Heradio et al., 2016)
2015 1.7498 0.0073 0.0380 0.0786 0 Global ontology research progress: A bibliometric analysis (Zhu et al., 2015)
2015 1.9873 0.0052 0.0397 0.0743 9 Bibliometric methods in management and organization (Zupic, 2015)
2015 1.9906 0.0029 0.0238 0.0516 13 A review of theory and practice in scientometrics (Mingers & Leydesdorff, 2015)
2014 1.6240 0.0087 0.0434 0.0850 3 Research dynamics: Measuring the continuity and
popularity of research topics
(Yan, 2014)
2014 1.1837 0.0031 0.0463 0.0554 1 Making a mark: A computational and visual analysis of
one researcher’s intellectual domain
(Skupin, 2014)
2014 0.4462 0.0024 0.0270 0.0307 12 The knowledge base and research front of information
science 2006–2010: An author cocitation and bibliographic
coupling analysis
(Zhao & Strotmann, 2014)
2013 2.5398 0.0112 0.0643 0.1223 13 Analysis of bibliometric indicators for individual scholars
in a large data set
(Radicchi & Castellano, 2013)
2013 1.0781 0.0065 0.2180 0.1152 6 A visual analytic study of retracted articles in scientific
literature
(Chen et al., 2013)
2013 1.7978 0.0064 0.0542 0.0854 24 Quantitative evaluation of alternative field normalization
procedures
(Li et al., 2013)
2012 3.6274 0.0107 0.0811 0.1466 29 SciMAT: A new science mapping analysis software tool (Cobo et al., 2012)
2012 3.4380 0.0248 0.0259 0.1302 15 A forward diversity index (Carley & Porter, 2012)
2012 1.0719 0.0032 0.0321 0.0479 11 Visualizing and mapping the intellectual structure of
information retrieval
(Rorissa & Yuan, 2012)
33
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
of the review is essentially limited to research papers published about 6–7 years
prior to the time of the review. Furthermore, we can see that the review systematically
emphasizes the diversity of topics instead of tracing to the origin of any particular
specialty.
5 Discussions and Conclusions
We present a visual domain analysis of the science mapping research. Our
intention is twofolds. First, our goal is to demonstrate the depth of a systematic
review that one can reach by applying a science mapping approach to itself. In
addition to the application of computational functions available in the new version
of the CiteSpace software, we enrich the procedure of producing a systematic review
of a knowledge domain by incorporating evolutionary models of a scientific
specialty, especially the four-stage model of a scientific discipline, into the
interpretation of the identified specialties. Our interpretation not only identifies
thematic milestones of major streams of science mapping research, but also
Figure 26. Three examples of articles with high modularity change rates: 1) (Waltman, 2016), 2) (Zupic,
2015), and 3) (Zhu et al., 2015).
Journal of Data and Information Science Vol. 2 No. 2, 2017
34
Expert Review
Journal of Data and
Information Science
Figure 27. Stars indicate articles that are both cited and citing articles. Dashed lines indicate novel co-citation
links. Illustrated based on 15 papers of the author’s own publications.
characterizes the developmental stages of the underlying specialties and the dynamics
of transitions from one specialty to another.
Second, our goal is to provide a reliable historiographic survey of the science
mapping research. The survey identifies the major clusters in terms of their high-
impact members and citing articles that form new research fronts. We also
demonstrate new insights that one can intuitively obtain through an inspection of
citation trajectories and the positions of citing papers. The enhanced science mapping
procedure introduced in this article is applicable to the analysis of other domains of
interest. Researchers can utilize these visual analytic tools to perform timely surveys
of the literature as frequently as they wish and find relevant publications more
effectively.
The most active areas of scientific inquiries are also where the level of uncertainty
is the highest (Chen, 2016; Fuchs, 1993). The evidence revealed in our study
suggests that science mapping is a Stage III specialty. Research instruments become
increasingly powerful and accessible. A wider range of applications of existing
techniques will in turn widen our horizen and deepen our understanding of the
challenges that we need to overcome in order to advance the state of the art of
science mapping.
35
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
Figure 28. Citation trajectories of Howard White’s publications and their own locations.
Figure 29. Novel links made by a review paper of informetrics (Bar-Ilan, 2008).
Journal of Data and Information Science Vol. 2 No. 2, 2017
36
Expert Review
Journal of Data and
Information Science
References
Ahlg ren, P., Jarneving, B., & Rousseau, R. (2003). Requirements for a cocitation similarity
measure, with special reference to Pearson’s correlation coefficient. Journal of the American
Society for Information Science and Technology, 54(6), 550–560.
Almi nd, T.C., & Ingwersen, P. (1997). Informetric analyses on the world wide web: Methodologi-
cal approaches to “webometrics”. Journal of Documentation, 53(4), 404–426.
Börn er, K., Chen, C., & Boyack, K.W. (2003). Visualizing knowledge domains. Annual Review of
Information Science and Technology, 37(1), 179–255.
Bar-I lan, J. (2008). Informetrics at the beginning of the 21st century – A review. Journal of
Informetrics, 2(1), 1–52.
Borga tti, S.P., Everett, M.G., & Freeman, L.C. (2002). Ucinet for Windows: Software for social
network analysis. Harvard, MA: Analytic Technologies.
Borgm an, C.L., & Furner, J. (2002). Scholarly communication and bibliometrics. Annual Review
of Information Science and Technology, 36, 3–72.
Bornm ann, L., & Daniel, H.D. (2008). What do citation counts measure? A review of studies on
citing behavior. Journal of Documentation, 64(1), 45–80.
Bosto ck, M., Ogievetsky, V., & Heer, J. (2011). D³: Data-driven documents. IEEE Transactions on
Visualization and Computer Graphics, 17(12), 2301–2309.
Boyack , K.W., Klavans, R., & Börner, K. (2005). Mapping the backbone of science. Scientomet-
rics, 64(3), 351–374.
Boyack, K.W., Wylie, B.N., & Davidson, G.S. (2002). Domain visualization using VxInsight®
for science and technology management. Journal of the American Society for Information
Science and Technology, 53(9), 764–774.
Br ehmer, M., & Munzner, T. (2013). A multi-level typology of abstract visualization tasks. IEEE
Transactions on Visualization and Computer Graphics, 19(12), 2376–2385.
Br zezinski, M. (2015). Power laws in citation distributions: Evidence from Scopus. Scientomet-
rics, 103(1), 213–228.
Ca llon, M., Courtial, J.P., Turner, W.A., & Bauin, S. (1983). From translations to problematic net-
works – an introduction to co-word analysis. Social Science Information Sur Les Sciences
Sociales, 22(2), 191–235.
Ca rd, S., Mackinlay, D.J., & Shneiderman, B. (1999). Readings in information visualization: Using
vision to think. San Francisco, CA: Morgan Kaufmann Publisher.
Ca rley, S., & Porter, A.L. (2012). A forward diversity index. Scientometrics, 90(2), 407–427.
Ch en, C. (1999a). Visualising semantic spaces and author co-citation networks in digital libraries.
Information Processing & Management, 35(2), 401–420.
Ch en, C. (1999b). Visualising semantic spaces and author co-citation networks in digital libraries.
Information Processing & Management, 35(3), 401–420.
Ch en, C. (2004). Searching for intellectual turning points: Progressive knowledge domain visual-
ization. Proceedings of the National Academy of Sciences of the United States of America,
101(suppl.), 5303–5310.
Ch en, C. (2005). Top 10 unsolved information visualization problems. IEEE Computer Graphics
and Applications, 25(4), 12–16.
37
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
Ch en, C. (2006). CiteSpace II: Detecting and visualizing emerging trends and transient patterns in
scientific literature. Journal of the American Society for Information Science and Technology,
57(3), 359–377.
C hen, C. (2008). An information-theoretic view of visual analytics. IEEE Computer Graphics &
Applications, 28(1), 18–23.
Ch en, C. (2010). Information visualization. Wiley Interdisciplinary Review: Computational
Statistics, 2(4), 387–403.
Ch en, C. (2012). Predictive effects of structural variation on citation counts. Journal of the
American Society for Information Science and Technology, 63(3), 431–449.
Ch en, C. (2016). Grand challenges in measuring and characterizing scholarly impact. Frontiers in
Research Metrics and Analytics, 1(4).
Ch en, C., Chen, Y., Horowitz, M., Hou, H., Liu, Z., & Pellegrino, D. (2009). Towards an explana-
tory and computational theory of scientific discovery. Journal of Informetrics, 3(3), 191–209.
Ch en, C., Cribbin, T., Macredie, R., & Morar, S. (2002). Visualizing and tracking the growth of
competing paradigms: Two case studies. Journal of the American Society for Information
Science and Technology, 53(8), 678–689.
Ch en, C., Dubin, R., & Kim, M.C. (2014). Emerging trends and new developments in regenerative
medicine: A scientometric update (2000–2014). Expert Opinion on Biological Therapy, 14(9),
1295–1317.
Ch en, C., Hu, Z., Liu, S., & Tseng, H. (2012). Emerging trends in regenerative medicine: A
scientometric analysis in CiteSpace. Expert Opinions on Biological Therapy, 12(5), 593–608.
Ch en, C., Hu, Z., Milbank, J., & Schultz, T. (2013). A visual analytic study of retracted articles in
scientific literature. Journal of the American Society for Information Science and Technology,
64(2), 234–253.
Ch en, C., Ibekwe-SanJuan, F., & Hou, J. (2010). The structure and dynamics of cocitation clusters:
A multiple-perspective cocitation analysis. Journal of the American Society for Information
Science and Technology, 61(7), 1386–1409.
Ch en, C., & Leydesdorff, L. (2014). Patterns of connections and movements in dual-map overlays:
A new method of publication portfolio analysis. Journal of the American Society for Informa-
tion Science and Technology, 65(2), 334–351.
Ch en, C., & Morris, S. (2003). Visualizing evolving networks: Minimum spanning trees versus
Pathfinder networks. Paper presented at the Proceedings of IEEE Symposium on Information
Visualization, Seattle, Washington.
Co bo, M.J., López-Herrera, A.G., Herrera-Viedma, E., & Herrera, F. (2012). SciMAT: A new
science mapping analysis software tool. Journal of the American Society for Information
Science and Technology, 63(8), 1609–1630.
Dee rwester, S., Dumais, S.T., Landauer, T K., Furnas, G.W., & Harshman, R.A. (1990). Indexing
by latent semantic analysis. Journal of the American Society for Information Science, 41(6),
391–407.
Din g, Y., Chowdhury, G., & Foo, S. (1999). Mapping the intellectual structure of information
retrieval studies: An author co-citation analysis, 1987–1997. Journal of Information Science,
25(1), 67–78.
Egg he, L. (2006). Theory and practice of the g-index. Scientometrics, 69(1), 131–152.
Fek ete, J. (2004). The InfoVis toolkit. Paper presented at the IEEE Symposium on Information
Visualization, Austin, Texas.
Journal of Data and Information Science Vol. 2 No. 2, 2017
38
Expert Review
Journal of Data and
Information Science
Fuc hs, S. (1993). A sociological theory of scientific change. Social Forces, 71(4), 933–953.
Gar field, E. (1955). Citation indexes for science: A new dimension in documentation through
association of ideas. Science, 122(3159), 108–111.
Hen ry, N., Fekete, J.D., & McGuffin, M.J. (2007). NodeTrix: A hybrid visualization of social
networks. IEEE Transactions on Visualization and Computer Graphics, 13(6), 1302–1309.
Her adio, R., Perez-Moragoa, H., Fernandez-Amorosa, D., Cabrerizoa, F.J., & Herrera-Viedmab, E.
(2016). A bibliometric analysis of 20 years of research on software product lines. Information
and Software Technology, 72, 1–15.
Her man, I., Melançon, G., & Marshall, M.S. (2000). Graph visualization and navigation in infor-
mation visualization: A survey. IEEE Transactions on Visualization and Computer Graphics,
6(1), 24–44.
Hick s, D., Wouters, P., Waltman, L., Rijcke, S.D., & Rafols, I. (2015). Bibliometrics: The Leiden
Manifesto for research metrics. Nature, 520(7548), 429–431.
Hirs ch, J.E. (2005). An index to quantify an individual’s scientific research output. Proceedings of
the National Academy of Sciences of the United States of America, 102(46), 16569–16572.
Hjør land, B. (1997). Information seeking and subject representation: An activity-theoretical
approach to information science. Westport, CT: Greenwood Press.
Hjørl and, B. (2002). Epistemology and the socio-cognitive perspective in information science.
Journal of the American Society for Information Science and Technology, 53(4), 257–270.
Johnso n, B., & Shneiderman, B. (1991, October 1991). Tree-maps: A space filling approach to the
visualization of hierarchical information structures. Paper presented at the IEEE Visualization
91.
Keim, D., Mansmann, F., Schneidewind, J., Thomas, J., & Ziegler, H. (2008). Visual analytics:
Scope and challenges. in S.J. Simoff, M.H. Böhlen, & A. Mazeika (Eds.), Visual Data Mining
(pp. 76–90). Berlin: springer-Verlag.
Kim, M .C., Zhu, Y., & Chen, C. (2016). How are they different? A quantitative domain comparison
of information visualization and data visualization (2000–2014). Scientometrics, 107(1), 123.
Kleinb erg, J. (2002). Bursty and hierarchical structure in streams. Paper presented at the Proceed-
ings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining, Edmonton, Alberta, Canada. Retrieved on February 19, 2017, from http://www.cs.
cornell.edu/home/kleinber/bhs.pdf.
Kuhn, T.S. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.
Lam, H ., Bertini, E., Isenberg, P., Plaisant, C., & Carpendale, S. (2012). Empirical studies in infor-
mation visualization: Seven scenarios. IEEE Transactions on Visualization and Computer
Graphics, 18(9), 1520–1536.
Leydes dorff, L., & Rafols, I. (2009). A global map of science based on the ISI subject categories.
Journal of the American Society for Information Science and Technology, 60(2), 348–362.
Li, Y. , Radicchi, F., Castellano, C., & Ruiz-Castillo, J. (2013). Quantitative evaluation of alterna-
tive field normalization procedures. Journal of Informetrics, 7(3), 746–755.
Meho, L.I., & Yang, K. (2007). Impact of data sources on citation counts and rankings of LIS
faculty: Web of science versus scopus and google scholar. Journal of the American Society for
Information Science and Technology, 58(13), 2105–2125.
39
Chaomei Chen
Expert Review
Journal of Data and
Information Science
Science Mapping: A Systematic Review of the Literature
http://www.jdis.org
https://www.degruyter.com/view/j/jdis
Miloje vić, S., Sugimoto, C.R., Yan, E., & Ding, Y. (2011). The cognitive structure of Library
and Information Science: Analysis of article title words. Journal of the American Society for
Information Science and Technology, 62(10), 1933–1953.
Mingers , J., & Leydesdorff, L. (2015). A review of theory and practice in scientometrics. European
Journal of Operational Research, 246(1), 1–19.
Morris, S.A., Yen, G., Wu, Z., & Asnake, B. (2003). Timeline visualization of research fronts.
Journal of the American Society for Information Science and Technology, 55(5), 413–422.
Mulkay, M.J., Gilbert, G.N., & Woolgar, S. (1975). Problem areas and research networks in
science. Sociology, 9(2), 187–203.
Munzner , T. (2014). Visualization analysis and design. Natick, MA: A K Peters/CRC Press.
Nerur, S.P. (2008). The intellectual structure of the strategic management field: An author
co-citation analysis. Strategic Management Journal, 29(3), 319–336.
Quirin, A., Cordón, O., Guerrero-Bote, V.P., Vargas-Quesada, B., & Moya-Anegón, F. (2008). A
quick MST-based algorithm to obtain pathfinder networks (, n 1). Journal of the American
Society for Information Science and Technology, 59(2), 1912–1924.
Radicchi, F. , & Castellano, C. (2013). Analysis of bibliometric indicators for individual scholars in
a large data set. Scientometrics, 97(3), 627–637.
Radicchi, F. , Fortunato, S., & Castellano, C. (2008). Universality of citation distributions: Toward
an objective measure of scientific impact. Proceedings of the National Academy of Sciences
of the United States of America, 105(45), 17268–17272.
Ramos-rodrig uez, A.R. (2004). Changes in the intellectual structure of strategic management
research: A bibliometric study of the Strategic Management Journal, 1980–2000. Strategic
Management Journal, 25(10), 981–1004.
Rorissa, A., & Yuan, X. (2012). Visualizing and mapping the intellectual structure of information
retrieval. Information Processing & Management, 48(1), 120–135.
Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., .. . Ideker, T. (2003).
Cytoscape: A software environment for integrated models of biomolecular interaction
networks. Genome Research, 13(11), 2498–2504.
Shneider, A. M. (2009). Four stages of a scientific discipline: Four types of scientists. Trends in
Biochemical Sciences, 34(5), 217–223.
Skupin, A. ( 2014). Making a mark: A computational and visual analysis of one researcher’s intel-
lectual domain. International Journal of Geographical Information Science, 28(6), 1209–1232.
Small, H. (1 973). Co-citation in the scientific literature: A new measure of the relationship between
two documents. Journal of the American Society for Information Science, 24(4), 265–269.
Small, H. (1 999). Visualizing science by citation mapping. Journal of the American Society for
Information Science and Technology, 50(9), 799–813.
Stasko, J., Gorg, C., & Liu, Z. (2008). Jigsaw: Supporting investigative analysis through interactive
visualization. Information Visualization, 7(2), 118–132.
Tabah, A.N. (1999). Literature dynamics: Studies on growth, diffusion, and epidemics. Annual
Review of Information Science and Technology, 34(1), 249–286.
Thelwall, M. , Haustein, S., Larivière, V., & Sugimoto, C.R. (2013). Do altmetrics work? Twitter
and ten other social web services. PLoS ONE, 8(5), e64841.
Journal of Data and Information Science Vol. 2 No. 2, 2017
40
Expert Review
Journal of Data and
Information Science
Thomas, J.J., & Cook, K.A. (2005). Illuminating the path: The research and development agenda
for visual analytics. Los Alamitos,CA: IEEE Computer Society Press.
Tibély, G., P ollner, P., Vicsek, T., & Palla, G. (2013). Extracting tag-hierarchies. PLoS ONE, 8(12),
e84133.
van Eck, N.J. , & Waltman, L. (2010). Software survey: VOSviewer, a computer program for
bibliometric mapping. Scientometrics, 84(2), 523–538.
Van Raan, A.F J. (2003). Sleeping beauties in science. Scientometrics, 59(3), 461–466.
Viégas, F.B., Wattenberg, M., Ham, F.v., Kriss, J., & McKeon, M. (2007). Many eyes: A site for
visualization at Internet scale. IEEE Transactions on Visualization and Computer Graphics,
13(6), 1121–1128.
Vogel, R., & Gü ttel, W.H. (2013). The dynamic capability view in strategic management: A biblio-
metric review. International Journal of Management Reviews, 15(4), 426–446.
Waltman, L. (201 6). A review of the literature on citation impact indicators. Journal of Informet-
rics, 10(2), 365–391.
White, H.D. (200 3). Pathfinder networks and author cocitation analysis: A remapping of paradig-
matic information scientists. Journal of the American Society for Information Science and
Technology, 54(5), 423–434.
White, H.D., & M cCain, K.W. (1997). Visualization of literatures. Annual Review of Information
Science and Technology, 32, 99–168.
White, H.D., & M cCain, K.W. (1998). Visualizing a discipline: An author co-citation analysis of
information science, 1972–1995. Journal of the American Society for Information Science,
49(4), 327–356.
White, H.D., & M cCain, K.W. (1998). Visualizing a discipline: An author co-citation analysis of
information science, 1972–1995. Journal of the American Society for Information Science
and Technology, 49(4), 327–355.
Yan, E. (2014). Research dynamics: Measuring the continuity and popularity of research topics.
Journal of Informetrics, 8(1), 98–110.
Yi, J.S., Kang, Y.A., Stasko, T.J., & Jacko, A.J. (2007). Towards a deeper understanding of the role
of interaction in information visualization. IEEE Transactions on Visualization and Computer
Graphics, 13(6), 1224–1231.
Zhao, D., & Stro tmann, A. (2014). The knowledge base and research front of information
science 2006–2010: An author cocitation and bibliographic coupling analysis. Journal of the
Association for Information Science and Technology, 65(5), 995–1006.
Zhu, Q., Kong, X ., Hong, S., Li, J., & He, Z. (2015). Global ontology research progress: a biblio-
metric analysis. Aslib Journal of Information Management, 67(1), 27–54.
Zupic, I. (2015) . Bibliometric methods in management and organization. Organizational Research
Methods, 18(3), 429–472.
This is an open access article licensed under the Creative Commons Attribution-NonCommercial-
NoDerivs License (http://creativecommons.org/licenses/by-nc-nd/4.0/).
... In this paper, the bibliometric analysis based on CiteSpace is used to explore the potential information contained in the current groundwater salinization research by drawing the corresponding knowledge context map to provide intuitive research information for researchers (Chen 2006(Chen , 2017Chen and Leydesdorff 2014). ...
... Keywords reflect the refinement and concentration of the core content, and repeated keywords in different studies can be regarded as research hotspots in a research field. Hotspots and development trends in the research field can be obtained by analyzing the co-occurrence keywords, word frequency and the connection strength between them (Chen 2017). In this section, the co-occurrence keywords frequency was statistically analyzed. ...
Article
Full-text available
As a global concern, the issue of groundwater salinization refers to the phenomenon of an increase in the overall chemical content over background levels in the groundwater. It involves a long-term process that could degrade groundwater quality and restrict its availability for drinking, irrigation and industry. For the effective protection and further research of groundwater resources, policy strongly depends on understanding the development, hotspots and trend directions of groundwater salinization research, which involves the degree, sources and processes of global groundwater salinization. However, such a comprehensive and systematic analysis has not been performed, and it is difficult to have a deeper understanding of groundwater salinization. The purpose of this paper is to analyze the knowledge structure, hot topics and trends in the field of groundwater salinization based on 6651 Web of Science (WoS) publications combined with CiteSpace for in-depth bibliometric and visual analysis. The results showed that 292 institutions in 125 countries have published articles in this field from 1970 to 2021. The USA was one of the most prolific contributors, with the largest number of publications and active institutions. Cooperation among authors has become frequent in recent years, and they tend to cooperate in groups. According to the analysis of co-occurrence keywords and co-cited articles, “water resources”, “sea level rise” and “variable density flow” were identified as three hot topics. A keyword burst analysis revealed the emerging trends of concerns about global climate change and the sustainable utilization of water resources. In addition, the possible opportunities and challenges were explored that may be faced in groundwater salinization research. The outcomes of this study are significant for future research on groundwater management and pollution control. Graphical abstract
... CiteSpace is a free Java-based bibliometric software package developed by Dr. Chaomei Chen (14). It provides multiple functions for bibliometric studies including collaboration network analysis, co-citation analysis, and cooccurrence analysis (15). A series of knowledge maps generated by Citespace helps one to explore the research frontiers and evolution of a scientific domain, and reveals the collaboration characteristics of institutions and authors in research fields (16). ...
Article
Full-text available
IntroductionAbnormal neurotransmission of glutamate and γ-aminobutyric acid (GABA) is a key characteristic of alcohol-related disorders. To track research output, we conducted a bibliometric analysis to explore the current status and trends in this field over the past decades.Methods Studies related to neurotransmitters and alcohol use disorder published in English from 2005–2021 were retrieved from the Web of Science Core Collection and Scopus databases. The R–bibliometrix package was used for a descriptive analysis of the publications. Citespace, WOSviewer, and R–bibliometrix were used to construct networks of countries/institutions/authors based on co-authorship, co-citation analysis of cited references and co-occurrence as well as burst detection of keywords.ResultsA total of 4,250 unique articles and reviews were included in the final analysis. The annual growth rate of publications was 5.4%. The USA was the most productive country in this field, contributing nearly half of the total documents. The top ten most productive institutions were all located in the USA. The most frequent worldwide collaboration was between the USA and Italy. The most productive and influential institution was the University of California. The author contributing the most productions to this field was Marisa Roberto from the Scripps Research Institute. The top co-cited reference was a review titled “Neurocircuitry of addiction.” The top journal in terms of the number of records and citations was Alcoholism: Clinical and Experimental Research. Comprehensive analyses have been conducted over past decades based on co-cited reference analysis, including modulators, transporters, receptor subtypes, and animal models. In recent years, the research frontiers have been shifting to the identification of risk factors/biomarkers, drug development for alcohol use disorder, and mechanisms related to alcoholic and non-alcoholic fatty liver.Conclusion Our bibliometric analysis shows that glutamate and GABA continue to be of interest in alcohol use disorder. The focus has evolved from mechanisms and medications related to glutamate and GABA in alcohol use disorder, to novel drug development, risk factor/biomarker identification targeting neurotransmitters, and the mechanisms of related diseases.
... They are applied to analyze the frontiers of a topic or research field from macro-to micro-perspectives, which includes elements such as countries, institutions, authors, keywords, and journals (Raparelli and Bajocco, 2019;Zhang et al., 2019). These tools integrate computer engineering, big data applications, and statistics, and are widely applied in many fields (Chen, 2017) to provide rich assessments and analyses in different areas. The advantages of bibliometrics are reflected in stronger analytical efficiency for keyword analysis, research hotspot frontiers, and reference co-occurrence analysis. ...
Article
Full-text available
Multispectral technology has a wide range of applications in agriculture. By obtaining spectral information during crop production, key information such as growth, pests and diseases, fertilizer and pesticide application can be determined quickly, accurately and efficiently. The scientific analysis based on Web of Science aims to understand the research hotspots and areas of interest in the field of agricultural multispectral technology. The publications related to agricultural multispectral research in agriculture between 2002 and 2021 were selected as the research objects. The softwares of CiteSpace, VOSviewer, and Microsoft Excel were used to provide a comprehensive review of agricultural multispectral research in terms of research areas, institutions, influential journals, and core authors. Results of the analysis show that the number of publications increased each year, with the largest increase in 2019. Remote sensing, imaging technology, environmental science, and ecology are the most popular research directions. The journal Remote Sensing is one of the most popular publishers, showing a high publishing potential in multispectral research in agriculture. The institution with the most research literature and citations is the USDA. In terms of the number of papers, Mtanga is the author with the most published articles in recent years. Through keyword co-citation analysis, it is determined that the main research areas of this topic focus on remote sensing, crop classification, plant phenotypes and other research areas. The literature co-citation analysis indicates that the main research directions concentrate in vegetation index, satellite remote sensing applications and machine learning modeling. There is still a lot of room for development of multi-spectrum technology. Further development can be carried out in the areas of multi-device synergy, spectral fusion, airborne equipment improvement, and real-time image processing technology, which will cooperate with each other to further play the role of multi-spectrum in agriculture and promote the development of agriculture.
... Considering that the two kinds of software of VOSviewer and Citespace can draw maps with a large amount of information and good visual effect, they can provide scientific research perspectives from different aspects. The primary motivation for using CiteSpace in this research was to simplify the search for essential papers in the knowledge domain documents so that visually significant characteristics can be searched in the optical network, and visual aids can be provided to identify the changes between adjacent nodes [48][49][50]. CiteSpace is a Java application developed by Professor Chen Chaomei of Drexel University in the United States for the visual analysis of the co-occurrence networks [51], which can effectively reveal the hot spots, trends, and development evolution of specific research fields [3]. Based on the above advantages, CiteSpace can be used for scientific research cooperation analysis, research hotspots analysis, and frontier trend analysis. ...
Article
Full-text available
Landslide prediction is one of the complicated topics recognized by the global scientific community. The research on landslide susceptibility prediction is vitally important to mitigate and prevent landslide disasters. The instability and complexity of the landslide system can cause uncertainty in the prediction process and results. Although there are many types of models for landslide susceptibility prediction, they still do not have a unified theoretical basis or accuracy test standard. In the past, models were mainly subjectively selected and determined by researchers, but the selection of models based on subjective experience often led to more significant uncertainty in the prediction process and results. To improve the universality of the model and the reliability of the prediction accuracy, it is urgent to systematically summarize and analyze the performance of different models to reduce the impact of uncertain factors on the prediction results. For this purpose, this paper made extensive use of document analysis and data mining tools for the bibliometric and knowledge mapping analysis of 600 documents collected by two data platforms, Web of Science and Scopus, in the past 40 years. This study focused on the uncertainty analysis of four key research subfields (namely disaster-causing factors, prediction units, model space data sets, and prediction models), systematically summarized the difficulties and hotspots in the development of various landslide prediction models, discussed the main problems encountered in these four subfields, and put forward some suggestions to provide references for further improving the prediction accuracy of landslide disaster susceptibility.
... The main purpose of structural variation analysis is to detect unprecedented inter-cluster bridges or new types of remote connections and understand why specific connections are novel and valuable capabilities. The idea is to identify the potential of an article to establish extraordinary or unexpected connections across distinct clusters (Chen 2017). Keywords are the high-level summary of the thesis, and their frequency and degree of relevance can reveal the research hotspots and internal connections in a certain field (Zhu et al. 2021). ...
Article
Full-text available
The planetary boundaries concept has triggered a vast amount of pure and applied scientific research, as well as policy and governance activities globally. Indeed, it has rapidly become a centerpiece of sustainability study. It is crucial to review the scientific state of the planetary boundaries (PB) concept systematically. However, there is a lack of research on drawing a scientific investigation map of planetary boundaries. Therefore, to clarify the spatial and temporal distribution characteristics, research hotspots, and frontiers of planetary boundaries, a scientometric analysis was performed based on 530 academic publications on planetary boundaries from 2009 to 2021. This paper conducted the analysis by visualizing the social network, dual-map overlay, co-cited references, structure variation article, and co-occurrence keywords with CiteSpace. The results show that as a new achievement and paradigm in sustainable development research, the planetary boundaries framework is gradually getting global attention and promotion, which has increasingly become an interdisciplinary hot research topic. The most productive authors and institutions are concentrated in England, the USA, Germany, and Sweden. Relevant articles were mainly published in journals focusing on ecology, earth, marine, veterinary, animal, economics, and politics. In addition, we summarized four predominant research themes by clustering keywords: the calculation of single boundary threshold and present value, the integration with assessment methods such as life cycle assessment and footprint families, the downscaling of planetary boundaries, and the expansion to economic and social domains. For scholars who are interested in this topic, this paper would be a useful reference and guideline.
... In terms of temporal metrics, two are of particular interest: citation burstness and sigma. Citation burstness, computed using Kleinberg's algorithm (Kleinberg, 2003), is an index of an abrupt change in the number of citations received by a node within a period of time (Chen, 2017). Possible values of citation burstness go from 0 to infinite. ...
Article
Purpose Among the growing interest towards market segmentation and targeted marketing, the current study adopted a scientometric approach to examine the literature on wine selection and preferences. The current review specifically attempts to shed light on the research that explores the determinants of wine preferences at multiple levels of analysis. Design/methodology/approach CiteSpace was used to compute a Document Co-Citation Analysis (DCA) on a sample of 114,048 eligible references obtained from 2,846 publications downloaded from Scopus on 24 May 2021. Findings An optimized network of 1,505 nodes and 4,616 links was generated. Within the network, impactful publications on the topic and thematic domains of research were identified. Specifically, two thematic macro-areas were identified through a qualitative analysis of papers included in the 7 major clusters. The first one - “Methods of Wine Making” - included clusters #0, #3, #5, #6 and #18. The second one - “Consumers' Attitudes and Preferences Towards Wine” - included clusters #1 and #2. The first thematic macro-area included more technical aspects referring to the process of wine making, while the second thematic macro-area focused more on the factors influencing individuals' preferences and attitudes towards wine. To reflect the aims of the current paper, publications giving light to the “Consumers' Attitudes and Preferences Towards Wine” macro-area were analyzed in detail. Originality/value The resulting insights may help wine makers and wine sellers optimize their work in relation to market segments and to the factors influencing individuals' purchasing behaviors.
Article
Surface-enhanced Raman scattering (SERS) has received increasing attention from researchers since it was first discovered on rough silver electrode surfaces in 1974 and has promising applications in life sciences, food safety, and environmental monitoring. The discovery of graphene has stirred considerable waves in the scientific community, attracting widespread attention in theoretical research and applications. Graphene exhibits the properties of a semi-metallic material and has also been found to have Raman enhancement effects such as in metals. At the same time, it quenches the fluorescence background and improves the ratio of a Raman signal to a fluorescence signal. However, graphene single-component substrates exhibit only limited SERS effects and are difficult to use for trace detection applications. The common SERS substrates based on noble metals such as Au and Ag can produce strong electromagnetic enhancement, which results in strong SERS signals from molecules adsorbed on the surface. However, these substrates are less stable and face the challenge of long-term use. The combination of noble metals and graphene to obtain composite structures was an effective solution to the problem of poor stability and sensitivity of SERS substrates. Therefore, graphene-based SERS has been a popular topic within the last decade. This review presents a statistically based analysis of graphene-based SERS using bibliometrics. Journal and category analysis were used to understand the historical progress of the topic. Geographical distribution was used to understand the contribution of different countries and institutions to the topic. In addition, this review describes the different directions under this topic based on keyword analysis and keyword co-occurrence. The studies on this topic do not show a significant divergence. The researchers’ attention has gradually shifted from investigating materials science and chemistry to practical sensing applications. At the end of the review, we summarize the main contents of this topic. In addition, several perspectives are presented based on bibliometric analysis.
Article
Purpose This study aims to investigate the knowledge structure and research trends in child and adolescent health literacy research over the past 30 years by using network text analysis. Method The study was conducted in four steps: 1) collecting abstracts, 2) keyword extraction and preprocessing, 3) creation of co-occurrence matrix, and 4) text network analysis. Results As a result of the centrality analysis, the upper-ranked core keywords were “health care,” “health behavior,” “prevention,” “treatment,” and “health promotion.” In contrast to earlier times, “barrier,” “caregiver,” “school setting,” and “QOL” have recently emerged as core keywords. Conclusions Over the past 30 years, child and adolescent health literacy has been importantly addressed in both clinical and public health approaches. The results of the current study provide references for future research in child and adolescent health literacy. Implication for practice and research This study suggests the need for additional research on school-based practices that can improve health literacy and for identifying validated and reliable multidimensional health literacy assessment instruments.
Article
Purpose This paper aims to examine the global pattern of growth and development of eHealth research based on publication headcount, and analysis of the characteristics, of the keywords used by authors and indexers to represent their research content during 1945–2019. Design/methodology/approach This study adopted a bibliometric research design and a quantitative approach. The source of the data was Elsevier’s Scopus database. The search query involved multiple search terms because researchers’ choice of keywords varies very significantly. The search for eHealth research publications was limited to conference papers and research articles published before 2020. Findings eHealth originated in the late 1990s, but it has become an envelope term for describing much older terms such as telemedicine, and its variants that originated much earlier. The keywords were spread through the 27 Scopus Subject Areas, with medicine (44.04%), engineering (12.84%) and computer science (11.47%) leading, while by Scopus All Science Journal Classification Health Sciences accounted for 55.83% of the keywords. Physical sciences followed with 30.62%. The classifications social sciences and life sciences made only single-digit contributions. eHealth is about meeting health needs, but the work of engineers and computer scientists is very outstanding in achieving this goal. Originality/value This study demonstrates that eHealth is an unexplored aspect of health literature and highlights the nature of the accumulated literature in the area. It further demonstrates that eHealth is a multidisciplinary area that is attractive to researchers from all disciplines because of its sensitive focus on health, and therefore requires pooling and integration of human resources and expertise, methods and approaches.
Article
Full-text available
Radiotherapy for rectal cancer has received increasing research attention in recent years; however, no bibliometric assessment has been conducted on the progress of research in this field. This study aimed to visualize the research evolution and emerging research hotspots in the field of rectal cancer radiotherapy using bibliometric methods. Data were collected from the Web of Science Core Collection database, including countries, institutions, authors, keywords, and co-citations of references, and the CiteSpace software was used for bibliometric analysis. A total of 5,372 publications on radiotherapy for rectal cancer, published between January 2000 and January 2022, were included. An increasing trend in the number of published articles was observed. There is an overall upward trend in the number of publications published, with the US publishing the most in this field, followed by China and the Netherlands. Italian writer Vincenzo Valentini and German writer R. Sauer ranked first in terms of published articles and co-cited authors, respectively. Literature co-citation and keyword co-occurrence analyses showed that early studies focused on topics such as preoperative radiotherapy, combined radiotherapy and chemotherapy, and total mesorectal excision. In recent years, gradually increasing attention has been paid to short-course radiotherapy, x-ray brachytherapy, and stereotactic systemic radiotherapy. Burst analysis suggested that magnetic resonance (MR)-guided neoadjuvant radiotherapy studies, mechanistic studies, and clinical trials may emerge as new research hotspots. Rectal cancer radiotherapy has been widely studied and the research hotspots have considerably changed in recent years. Future research hotspots may include MR-guided neoadjuvant radiotherapy studies, mechanistic studies, and clinical trials.
Article
Full-text available
The constantly growing body of scholarly knowledge of science, technology, and humanities is an asset of the mankind. While new discoveries expand the existing knowledge, they may simultaneously render some of it obsolete. It is crucial for scientists and other stakeholders to keep their knowledge up to date. Policy makers, decision makers, and the general public also need an efficient communication of scientific knowledge. Several grand challenges concerning the creation, adaptation, and diffusion of scholarly knowledge, and advance quantitative and qualitative approaches to the study of scholarly knowledge are identified.
Article
Full-text available
Information visualization and data visualization are often viewed as similar, but distinct domains, and they have drawn an increasingly broad range of interest from diverse sectors of academia and industry. This study systematically analyzes and compares the intellectual landscapes of the two domains between 2000 and 2014. The present study is based on bibliographic records retrieved from the Web of Science. Using a topic search and a citation expansion, we collected two sets of data in each domain. Then, we identified emerging trends and recent developments in information visualization and data visualization, captivated in intellectual landscapes, landmark articles, bursting keywords, and citation trends of the domains. We found out that both domains have computer engineering and applications as their shared grounds. Our study reveals that information visualization and data visualization have scrutinized algorithmic concepts underlying the domains in their early years. Successive literature citing the datasets focuses on applying information and data visualization techniques to biomedical research. Recent thematic trends in the fields reflect that they are also diverging from each other. In data visualization, emerging topics and new developments cover dimensionality reduction and applications of visual techniques to genomics. Information visualization research is scrutinizing cognitive and theoretical aspects. In conclusion, information visualization and data visualization have co-evolved. At the same time, both fields are distinctively developing with their own scientific interests.
Article
Full-text available
Context: Software product line engineering has proven to be an efficient paradigm to developing families of similar software systems at lower costs, in shorter time, and with higher quality. Objective: This paper analyzes the literature on product lines from 1995 to 2014, identifying the most influential publications, the most researched topics, and how the interest in those topics has evolved along the way. Method: Bibliographic data have been gathered from ISI Web of Science and Scopus. The data have been examined using two prominent bibliometric approaches: science mapping and performance analysis. Results: According to the study carried out, (i) software architecture was the initial motor of research in SPL; (ii) work on systematic software reuse has been essential for the development of the area; and (iii) feature modeling has been the most important topic for the last fifteen years, having the best evolution behavior in terms of number of published papers and received citations. Conclusion: Science mapping has been used to identify the main researched topics, the evolution of the interest in those topics and the relationships among topics. Performance analysis has been used to recognize the most influential papers, the journals and conferences that have published most papers, how numerous is the literature on product lines and what is its distribution over time.
Conference Paper
Computer-based visualization (vis) systems provide visual representations of datasets designed to help people carry out tasks more effectively. Visualization is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods. The design space of possible vis idioms is huge, and includes the considerations of both how to create and how to interact with visual representations. Vis design is full of trade-offs, and most possibilities in the design space are ineffective for a particular task, so validating the effectiveness of a design is both necessary and difficult. Vis designers must take into account three very different kinds of resource limitations: those of computers, of humans, and of displays. Vis usage can be analyzed in terms of why the user needs it, what data is shown, and how the idiom is designed. I will discuss this framework for analyzing the design of visualization systems.
Article
This study presents an extensive domain analysis of a discipline - information science - in terms of its authors. Names of those most frequently cited in 12 key journals from 1972 through 1995 were retrieved from Social Scisearch via DIALOG. The top 120 were submitted to author co-citation analyses, yielding automatic classifications relevant to histories of the field. Tables and graphics reveal: (1) The disciplinary and institutional affiliations of contributors to information science; (2) the specialty structure of the discipline over 24 years; (3) authors' memberships in 1 or more specialties; (4) inertia and change in authors' positions on 2-dimensional subject maps over 3 8-year subperiods, 1972-1979, 1980-1987, 1988-1995; (5) the 2 major subdisciplines of information science and their evolving memberships; (6) "canonical" authors who are in the top 100 in all three subperiods; (7) changes in authors' eminence and influence over the subperiods, as shown by mean co-citation counts; (8) authors with marked changes in their mapped positions over the subperiods; (9) the axes on which authors are mapped, with interpretations; (10) evidence of a paradigm shift in information science in the 1980s; and (11) evidence on the general nature and state of integration of information science. Statistical routines include ALSCAL, INDSCAL, factor analysis, and cluster analysis with SPSS; maps and other graphics were made with DeltaGraph. Theory and methodology are sufficiently detailed to be usable by other researchers.