This is a pre-print version of an article. The actual version will be published in ACM DL at
http://dx.doi.org/10.1145/2812428.2812442 . Please use the official version of the paper and publication
reference when citing: Antti Knutas, Arash Hajikhani, Juho Salminen, Jouni Ikonen, and Jari Porras. 2015.
Cloud-based bibliometric analysis service for systematic mapping studies. In Proceedings of the 16th
International Conference on Computer Systems and Technologies (CompSysTech '15). ACM, New York,
NY, USA, 184-191. DOI=http://dx.doi.org/10.1145/2812428.2812442
Copyright © 2015 by the Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of portions of this
work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial
advantage and that copies bear this notice and the full citation on the first page in print or the first screen in digital media. Copyrights
for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted.
To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Send written
requests for republication to ACM Publications, Copyright & Permissions at the address above or fax +1 (212) 869-0481 or
email permissions@acm.org.
For other copying of articles that carry a code at the bottom of the first or last page, copying is permitted provided that the per-copy fee
indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
Cloud-Based Bibliometric Analysis Service for Systematic Mapping
Studies
Antti Knutas, Arash Hajikhani, Juho Salminen, Jouni Ikonen, Jari Porras
Abstract: There is an increasing number of scientific articles being published, which makes tracking
the state of the art more time-consuming. There are software tools available to help with systematic mapping
studies in a field of science, but most of these tools are closed source and involve several manual time-
consuming steps that could be automated further. We present an open solution as a cloud-based design for
bibliographic analysis that makes the research method available for a wider audience.
Key words: Systematic mapping study, literature review, bibliometric analysis, citation analysis, social
network analysis, R, SaaS, cloud computing
INTRODUCTION
A systematic mapping study (SMS) classifies and structures a field of interest in
research by categorizing publications and analyzing their publication trends [17].
Additionally, SMS can provide a sufficient terminology to facilitate the overall analysis of
studies in the field, and represents the research methods and outcomes [3]. There is an
increasing number of papers being published in any given field, which makes it difficult to
give as much attention to individual research papers as previously [15]. At the same time
systematic mapping studies are becoming an increasingly important method of getting an
overview about the state of a field of science, also in software engineering and education
[7, 11]. This means that these mapping studies, which involve several manual processing
and analysis steps, are becoming increasingly time-consuming. A mapping study involves
at a minimum a systematic search from scientific databases, archival, sorting, manual
overview of articles by reading, recording selected metadata and content information,
categorizing articles and writing a summary from each selected article.
For example, manually reviewing all the 990 articles in the CompSysTech 2000-2014
conference proceedings would take a researcher about 82 hours (990 x 5 minutes), or
nearly three and a half days of uninterrupted work, even if reviewing one
article took only five minutes. This problem has been recognized by several research
communities and several tools have been developed that assist in statistical and citation
network analysis [6, 8]. However, these tools still require setup and expert knowledge in
data preparation and processing, and they are closed systems. The current tools available
are not extensible, meaning that each analysis problem a researcher faces has to be re-
implemented from the ground up. We propose that open, extensible tools with even more
automated workflows will make this bibliographic analysis available to a wider part of the
community of researchers and enable more people to get statistical insight into their
scientific database search results. As a way to test and demonstrate our proposal we
implemented an extensible web-based interface for our literature analysis program.
International Conference on Computer Systems and Technologies - CompSysTech15
This is a pre-print version of an article. The actual version will be published in ACM DL at http://dl.acm.org/event.cfm?id=RE248 by late
2015. Please use the official version of the paper and publication reference when citing: Knutas, A., Hajikhani, A., Salminen, J., Ikonen,
J., Porras, J., 2015. Cloud-Based Bibliometric Analysis Service for Systematic Mapping Studies. CompSysTech 2015.
We present the literature analysis tool NAILS, which uses a series of custom
statistical and network analysis functions to give the user an overview of literature
datasets. The features can be divided into two primary sections: Firstly, statistical analysis,
which for example gives an overview of publication frequencies, most published authors
and journals. Secondly, the more novel network analysis, which gives further insight into
the relationships between the interlinked citations and cooperation between authors. For
example, the most basic features can use citation network analysis to identify the most cited
authors and publication forums. Advanced features support mapping researcher
cooperation and citation networks, and finding the core publications in the examined field
of science. The tool's source code is freely available on GitHub, an open source code
repository, and the web-based interface can also be accessed from the project page
(http://aknutas.github.io/nails/).
Our research question in this study is as follows: How can the bibliographic analysis
process for systematic mapping studies be made more straightforward and accessible for
researchers? When accomplishing this task, we also investigate as a subquestion which
kind of automatic analysis results would provide additional value to researchers.
The next section of the paper reviews publications on systematic mapping studies,
analysis tools and the possibilities of network analysis in bibliometric studies. The second
to last section introduces the analysis software design, features and the utility of analysis
results. The paper ends with the discussion and conclusion section.
RELATED WORK ON SYSTEMATIC MAPPING STUDIES
A systematic mapping study (SMS) is a secondary study that aims at classification
and thematic analysis of earlier research [11, 17]. It is closely related to a wider secondary
study, a systematic literature review (SLR), which aims at gathering and evaluating all the
research results on a selected research topic [2, 10]. Kitchenham and Charters [11]
present the best practices of both for the field of software engineering and also compare
the two. The SMS is more general in search terms and aims at classifying and structuring
the field of research, while the target of SLR is to summarize and evaluate the research
results. Kitchenham and Charters [11] also discuss the applications and note that an SMS
can be especially suitable when few literature reviews have been done on the topic and there
is a need to get a general overview of the field of interest. Both kinds of studies can be
used to identify research gaps in the current state of research.
While less deep in analysis than a full systematic literature review, a systematic
mapping study includes the following steps [17]: 1) Systematically determining the search
terms and databases. 2) Performing test searches to validate the search terms. 3)
Running a full search and storing the results. 4) Deduplication, sorting and application of
inclusion and exclusion criteria. 5) Fully reviewing the papers according to established
criteria.
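Step 4 of the list above (deduplication and applying inclusion and exclusion criteria) can be sketched in Python as follows. This is a minimal illustration and not part of NAILS: the record fields (`doi`, `title`, `year`), the title normalization rule, the criteria and the sample records are all assumptions made up for the example.

```python
import re


def record_key(record):
    """Return a key that identifies a publication: DOI if present, else normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Fall back to the title with case, punctuation and spacing normalized.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title)


def deduplicate(records):
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def apply_criteria(records, first_year, last_year, required_terms):
    """Include records inside the year range whose title mentions any required term."""
    selected = []
    for record in records:
        title = record.get("title", "").lower()
        in_range = first_year <= record.get("year", 0) <= last_year
        on_topic = any(term.lower() in title for term in required_terms)
        if in_range and on_topic:
            selected.append(record)
    return selected


# Made-up search results: two pairs of duplicates with small formatting differences.
search_results = [
    {"doi": "10.1000/a1", "title": "Collaborative Learning Online", "year": 2003},
    {"doi": "10.1000/A1", "title": "Collaborative learning online.", "year": 2003},
    {"doi": "", "title": "Graph Theory Basics", "year": 1985},
    {"doi": "", "title": "graph theory: basics", "year": 1985},
]
unique_records = deduplicate(search_results)
included = apply_criteria(unique_records, 1990, 2015, ["collaborative"])
```

In practice the inclusion and exclusion criteria of a mapping study are usually richer than a year range and a title term, and borderline records still need a manual check.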
The challenge of analyzing author and citation interactions can be approached with
social network analysis. Social network analysis (SNA) is an interdisciplinary technique for
the analysis of social networks [14], where social relationships are viewed in the terms of
network theory. In social network analysis, communication between individuals or social
units is mapped into a communication matrix and then visualized in graphs. In graph
theory there are different mathematical tools available, which can be used, for example, to
estimate the relative influence of nodes in the graph or analyze the graph by the nodes'
connection patterns [1, 12]. In scientific bibliometric analysis the communication patterns,
or citations, of different authors can be analyzed by modeling publications or authors as
nodes and mapping citations or co-publications as node edges. The method of social
network analysis for bibliometric studies has been applied for citation network visualization
[9], and for detecting and analyzing co-authorship networks [16].
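As an illustration of modeling authors as nodes and co-publications as edges, the following Python sketch builds a weighted co-authorship edge list from per-paper author lists. It is a hedged example rather than the method used in the cited studies; the function name `coauthor_edges` and the placeholder author names are assumptions.

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(author_lists):
    """Count how many publications each pair of authors has written together."""
    edges = Counter()
    for authors in author_lists:
        # Sort so that (A, B) and (B, A) map to the same undirected edge.
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges


# Placeholder data: each inner list holds the authors of one publication.
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author B", "Author A"],
    ["Author D"],
]
edges = coauthor_edges(papers)
```

The edge weights give the number of shared publications, and a single-author paper contributes no edges. A citation network can be built the same way with publications as nodes and directed edges from the citing to the cited publication.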
AUTOMATING SYSTEMATIC MAPPING STUDIES WITH NAILS
The literature analysis system presented in this paper consists of two major parts: the
series of analysis scripts, implemented in R, and the web-based batch processing
interface. The web interface is responsible for queuing up jobs and serving results, and the
R analysis component performs the statistical and network analysis. The server uses an
extensible plug-in architecture and the open analysis components can be modified to have
new features or new plugins can be added to the system.
The two-part design of the analysis system and the separation of components are
intended to make software design for different environments easier. This way the open
source R functions can for example be included in other desktop analysis software
packages and similarly more analysis components can be added to server without having
to recompile the server software itself.
The entire process can be deployed to a group of dynamically scaling cloud
instances. The analysis process is initiated by a user from the front-end web server, which
keeps track of input files and queues up job requests into a relational database. The queue
of job requests is then batch processed by a separate analysis program, which can be on
a single server in low traffic situations, or on several different server instances when
necessary. After the analysis is complete, the analysis program uploads the results to
storage and updates the database with completed job status. The request process
between different server components is illustrated in Figure 1.
Figure 1. Traversal of a job request through the system components
The separation of components might seem excessive at first glance, but the
software components can be run on a single machine in low-traffic situations. Conversely,
the components can be deployed on separate servers with multiple simultaneous program
instances if necessary, and a cloud storage service with content distribution services can be used
for file storage.
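The job flow described above (the front end queues a request in a relational database, a separate worker processes it and records the result) can be sketched as follows. This is a minimal sketch under stated assumptions and not the NAILS server code: it uses Python's built-in `sqlite3` module with an in-memory database, and the table layout, status values, function names and the stand-in `analyze` callback are all hypothetical.

```python
import sqlite3


def create_queue(connection):
    """Create the job table shared by the web front end and the workers."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "input_file TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'queued', "
        "result_location TEXT)"
    )
    connection.commit()


def enqueue(connection, input_file):
    """Front end: record an uploaded file as a queued job and return its id."""
    cursor = connection.execute(
        "INSERT INTO jobs (input_file) VALUES (?)", (input_file,)
    )
    connection.commit()
    return cursor.lastrowid


def process_next(connection, analyze):
    """Worker: run the oldest queued job; return its id, or None if nothing was run."""
    row = connection.execute(
        "SELECT id, input_file FROM jobs WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    job_id, input_file = row
    # Claim the job only if it is still queued, so two workers cannot both take it.
    claimed = connection.execute(
        "UPDATE jobs SET status = 'running' WHERE id = ? AND status = 'queued'",
        (job_id,),
    ).rowcount
    connection.commit()
    if claimed == 0:
        return None
    result_location = analyze(input_file)
    connection.execute(
        "UPDATE jobs SET status = 'done', result_location = ? WHERE id = ?",
        (result_location, job_id),
    )
    connection.commit()
    return job_id


def job_status(connection, job_id):
    """Front end: look up the status and result location of a job."""
    return connection.execute(
        "SELECT status, result_location FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()


connection = sqlite3.connect(":memory:")
create_queue(connection)
first_job = enqueue(connection, "upload-1.txt")
# The lambda stands in for the R analysis step and the upload to storage.
process_next(connection, lambda path: "results/" + path + ".html")
```

Because the queue lives in the database rather than in the web server process, the same worker function could run on one machine or on several instances, which is the property the two-part design relies on.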
Functionality and Services
The system works on publication records available for download from Thomson
Reuters Web of Science Core Collection. The system analyses seven essential variables
from each publication, which include the authors, keywords, publication forum, article type
and cited articles. The user downloads the literature data from Web of Science and
uploads it to the analysis system via a web interface. The system then removes duplicate
records and performs an exploratory data analysis on provided literature data. The
analysis identifies for instance the most cited articles and authors, most common
keywords, and journals with the most publications. These statistics are accompanied by
visualizations for a quick data overview. Additionally, the system extracts the citation
network data from the literature. Having access to a citation network enables
calculating how many times each reference has been cited by a paper inside the analyzed
dataset. This feature is useful for identifying influential sources within a field of science,
especially because it finds often-cited papers and books not listed in the primary literature
dataset.
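The in-dataset citation count described above can be illustrated with a short Python sketch. It is not the NAILS implementation (which is written in R); the record layout with `id` and `cited_references` fields and the sample identifiers are assumptions for the example.

```python
from collections import Counter


def count_cited_references(records):
    """Count how many papers in the dataset cite each reference."""
    counts = Counter()
    for record in records:
        # A set ensures that one paper contributes at most one citation per reference.
        counts.update(set(record["cited_references"]))
    return counts


# Made-up dataset: three papers, all of which cite a book that is not a record itself.
dataset = [
    {"id": "P1", "cited_references": ["BOOK-X", "P3"]},
    {"id": "P2", "cited_references": ["BOOK-X", "P1"]},
    {"id": "P3", "cited_references": ["BOOK-X"]},
]
citation_counts = count_cited_references(dataset)
dataset_ids = {record["id"] for record in dataset}
# References cited by the dataset that are not themselves records in it.
external = {ref: n for ref, n in citation_counts.items() if ref not in dataset_ids}
```

Here `external` picks out the often-cited source that would be missing from a count based only on the downloaded records.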
In addition to providing an exploratory analysis report, the system extracts and
exports data about citation and author cooperation networks that can be visualized e.g.
with the Gephi [4] open graph visualization platform. This dataset of citation connections
can be used to calculate the relative influence of publications in the network, for example
using eigenvector centrality analysis. Eigenvector centrality is a measure of node centrality,
which can be applied to identify nodes that play central roles in the network structure. It
can be seen as a weighted sum of not only direct connections, but also indirect connections of
every length [5]. Compared to simpler geometrical measures like degree centrality (i.e.
total number of citations), eigenvector centrality also considers the influence of the
connected nodes and takes into account the entire pattern of the graph. Where degree
centrality gives a simple count of number of connections a node has, eigenvector centrality
assigns higher values to connections to higher-ranking nodes [13]. For example, with this
calculation method a node with few high-ranking connections might outrank a node with a
larger number of low-ranking connections.
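The difference between the two measures can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions, not the calculation performed by NAILS or Gephi: the graph is treated as undirected, the scores are approximated by power iteration, and the example network and node names are invented.

```python
def degree_centrality(graph):
    """Number of connections of each node in an undirected graph."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def eigenvector_centrality(graph, iterations=200):
    """Approximate eigenvector centrality by power iteration.

    Each round adds a node's own score to the sum of its neighbors' scores
    (iterating with A + I instead of A). This keeps the same leading
    eigenvector but avoids oscillation on bipartite graphs.
    """
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        updated = {
            node: scores[node] + sum(scores[n] for n in graph[node])
            for node in graph
        }
        # Rescale so that the highest-scoring node has a score of 1.
        largest = max(updated.values())
        scores = {node: value / largest for node, value in updated.items()}
    return scores


def undirected(edges):
    """Build an adjacency mapping from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


# A densely connected core of five nodes, each linked to every other core node.
core = ["c1", "c2", "c3", "c4", "c5"]
edges = [(a, b) for i, a in enumerate(core) for b in core[i + 1:]]
# X has three connections, two of them to the well-connected core.
edges += [("X", "c1"), ("X", "c2"), ("X", "Y")]
# Y has four connections, but three of them are to otherwise isolated nodes.
edges += [("Y", "l1"), ("Y", "l2"), ("Y", "l3")]

graph = undirected(edges)
degree = degree_centrality(graph)
centrality = eigenvector_centrality(graph)
```

In this example `Y` has the higher degree (4 against 3), while `X` receives the higher eigenvector centrality because its neighbors are themselves well connected.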
Analysis Case Study
For the purpose of demonstration a sample dataset was retrieved from the Thomson
Reuters Web of Science with the search term of “computer supported collaborative
learning” and year limit of 1990-2015. 1806 records were stored and processed with the
analysis system. The processing and output rendering phase took 32 seconds on a
dual-core 2 GHz Xeon test server. The entire analysis process, from database search through
analysis server upload to result download, took three and a half minutes.
The keyword summary section from the exploratory data analysis is displayed in
Figure 2 as a sample feature. It gives an overview of the other common
research topics in the dataset. In the sample dataset it can be seen that distance
education and higher learning are current research topics in computer-supported
collaborative learning.
Figure 2. Most commonly occurring keywords in the example dataset
Another result from exploratory data analysis is the publication citation counts. An
example result for citation counts from the sample dataset is displayed in Table 1.
Another measure of relevance is the centrality value, displayed in the rightmost column
(eigenvector), which is discussed further in the next paragraph.
Table 1. Four most cited articles in the example dataset
Author          Year   Identifier                       Citations   Eigenvector
KREIJNS, K.     2003   10.1016/S0747-5632(02)00057-2    274         0,622
ALAVI, M.       1994   10.2307/249763                   238         0,005
STAHL, G.       2006   V, P409                          220         0,243
DE WEVER, WB.   2006   10.1016/J.COMPEDU.2005.04.005    165         0,284
Additionally, we applied an influence analysis to the network using the eigenvector
centrality measure. In Figure 3 we present a visualization of the network analysis
results using the Gephi [4] visualization software. In the graph the size and shade denote
node, or article, influence, with darkest and largest nodes being the most influential.
Because of size limitations we display only the 250 most influential nodes according to the
eigenvector centrality analysis results. Additionally, we marked the most referred article in
black (node D) and the three most central articles with white and light gray (nodes A, B,
C). In the figure the benefits of centrality analysis become apparent. The literature review
article by Kreijns (node D) has the most citations, but is not as central to the field of science as
for example the three other nodes, which discuss fundamental issues of CSCL and are
commonly cited by other influential articles in the dataset.
The value of centrality analysis is highlighted by the results from the sample dataset.
A basic citation count would not have highlighted the fundamental articles, because
citations to them are spread across several valued papers. Literature review articles, in
contrast, are rarer and often cited, despite not interacting bibliographically as much with
the field of science.
Figure 3. Visualization of the social network analysis results from the example dataset
DISCUSSION AND CONCLUSION
The exploratory analysis figures and network analysis results can help a researcher
to get a quick visual overview and then deeper insight into the investigated field of science.
By having the data automatically analyzed and visualized, the researcher can identify the
core publications, publication trends, common research themes and the direction of latest
research. With centrality analysis the core publications can be identified more reliably than
by measuring the total number of citations, because it takes into account citation weighting
and considers the centrality of citing articles.
In multidisciplinary fields network analysis enables identifying the interplay of citations
and the contributions from other disciplines. This enables the researcher to see how the
multidisciplinary nature of the field has formed and from which papers. Another opportunity
emerges from adding the dimension of time to the network, which makes it possible to visualize the
evolution of the publication network. This feature illustrates how the literature in a given
field of science came to exist over time and allows one to identify publication forums and
influential publications at different periods.
The research question of this paper was how to make systematic mapping studies
more straightforward and accessible for researchers. We presented an open, extensible
cloud-based literature analysis architecture as a solution and an implementation of that
architecture. The presented tool, NAILS, allows the user to get a statistical and network
overview of bibliographical datasets by uploading them to the cloud-based analysis service.
The service uses an open source, extensible plugin architecture, which can serve as a
platform for researchers who implement additional analysis features.
The sample implementation is a basic version and could benefit from additional
features. The largest limitation is that data import currently works only with Web of Science
input data and thus cannot process articles not included in that database. The second
limitation is that while the system analyses both statistical and citation network data, at the
moment it visualizes only statistical data, requiring a user-installed software package for
network visualization and the display of centrality values. Future work will include adding
these features using the open plugin architecture and including additional major data
sources, like Scopus. Additionally, having automatic import tools as browser plugins for
different scientific datasets would make initiating analysis jobs even easier for researchers,
but automatic downloads would also involve complicated copyright issues.
REFERENCES
[1] Abraham, A. and Hassanien, A.E. 2010. Computational Social Network
Analysis: Trends, Tools and Research Advances. Springer.
[2] De Almeida Biolchini, J.C. et al. 2007. Scientific research ontology to support
systematic review in software engineering. Advanced Engineering Informatics. 21, 2 (Apr.
2007), 133-151.
[3] Bailey, J. et al. 2007. Evidence relating to Object-Oriented software design:
A survey. ESEM (2007), 482-484.
[4] Bastian, M. et al. 2009. Gephi: An open source software for exploring and
manipulating networks. International AAAI conference on weblogs and social media
(2009).
[5] Bonacich, P. 2007. Some unique properties of eigenvector centrality. Social
Networks. 29, 4 (Oct. 2007), 555-564.
[6] Börner, K. 2011. Science of Science Studies: Sci2 Tool. Communications of
the ACM. 54, 3 (2011), 60-69.
[7] Borrego, M. et al. 2014. Systematic Literature Reviews in Engineering
Education and Other Developing Interdisciplinary Fields. Journal of Engineering
Education. 103, 1 (Jan. 2014), 45-76.
[8] Van Eck, N.J. and Waltman, L. 2014. CitNetExplorer: A new software tool for
analyzing and visualizing citation networks. Journal of Informetrics. 8, 4 (Oct. 2014),
802-823.
[9] Van Eck, N.J. and Waltman, L. 2014. Visualizing bibliometric networks.
Measuring Scholarly Impact. Springer. 285-320.
[10] Kitchenham, B. et al. 2009. Systematic literature reviews in software
engineering - A systematic literature review. Information and Software Technology. 51, 1
(Jan. 2009), 7-15.
[11] Kitchenham, B.A. and Charters, S. 2007. Guidelines for performing
systematic literature reviews in software engineering. Technical Report #EBSE-2007-01.
Department of Computer Science, University of Durham.
[12] Knoke, D. et al. 2008. Social network analysis. Sage Publications Los
Angeles, CA.
[13] Newman, M.E. 2008. The mathematics of networks. The new palgrave
encyclopedia of economics. 2, (2008), 1-12.
[14] Otte, E. and Rousseau, R. 2002. Social network analysis: a powerful
strategy, also for the information sciences. Journal of Information Science. 28, 6 (Dec.
2002), 441-453.
[15] Parolo, B. et al. 2015. Attention decay in science. Available at SSRN
2575225. (2015).
[16] Perianes-Rodríguez, A. et al. 2010. Detecting, identifying and visualizing
research groups in co-authorship networks. Scientometrics. 82, 2 (2010), 307-319.
[17] Petersen, K. et al. 2008. Systematic mapping studies in software
engineering. 12th International Conference on Evaluation and Assessment in Software
Engineering (2008), 1.
ABOUT THE AUTHORS
Antti Knutas, M.Sc., Lappeenranta University of Technology, Phone: +358-0294-462-111,
E-mail: antti.knutas@lut.fi.
Arash Hajikhani, M.Sc., Lappeenranta University of Technology, Phone: +358-0294-462-111,
E-mail: arash.hajikhani@lut.fi.
Juho Salminen, M.Sc., Lappeenranta University of Technology, Phone: +358-0294-462-111,
E-mail: juho.salminen@lut.fi.
Associate Professor Jouni Ikonen, D.Sc., Lappeenranta University of Technology,
Phone: +358-0294-462-111, E-mail: jouni.ikonen@lut.fi.
Professor Jari Porras, D.Sc., Lappeenranta University of Technology,
Phone: +358-0294-462-111, E-mail: jari.porras@lut.fi.
... To the best of our knowledge, there is a lack of such tools. Either tools require statistical knowledge and programming skills [4], or they focus on certain bibliometric analysis techniques-and thus do not provide comprehensive functionalities [5][6][7], they do not support a wide range of data providers [8], they do not cover a wide range of academic fields [9], or they do not much consider usability aspects [10]. ...
Article
Full-text available
A deep understanding about a field of research is valuable for academic researchers. In addition to technical knowledge, this includes knowledge about subareas, open research questions, and social communities (networks) of individuals and organizations within a given field. With bibliometric analyses, researchers can acquire quantitatively valuable knowledge about a research area by using bibliographic information on academic publications provided by bibliographic data providers. Bibliometric analyses include the calculation of bibliometric networks to describe affiliations or similarities of bibliometric entities (e.g., authors) and group them into clusters representing subareas or communities. Calculating and visualizing bibliometric networks is a nontrivial and time-consuming data science task that requires highly skilled individuals. In addition to domain knowledge, researchers must often provide statistical knowledge and programming skills or use software tools having limited functionality and usability. In this paper, we present the ambalytics bibliometric platform, which reduces the complexity of bibliometric network analysis and the visualization of results. It accompanies users through the process of bibliometric analysis and eliminates the need for individuals to have programming skills and statistical knowledge, while preserving advanced functionality, such as algorithm parameterization, for experts. As a proof-of-concept, and as an example of bibliometric analyses outcomes, the calculation of research fronts networks based on a hybrid similarity approach is shown. Being designed to scale, ambalytics makes use of distributed systems concepts and technologies. It is based on the microservice architecture concept and uses the Kubernetes framework for orchestration. This paper presents the initial building block of a comprehensive bibliometric analysis platform called ambalytics, which aims at a high usability for users as well as scalability.
... In order to analyse scholarly outputs in the area of the research, a web-based software called HAMMER was used. HAMMER is a web-based server for automating a network analysis for literature study scripts [44]. In the quality analysis of the documents, the top 100 documents with the most citation per year were investigated. ...
Preprint
Full-text available
This bibliometric study investigated the public trends in the fields of nanoparticles which is limited to drug delivery and magnetic nanoparticles’ literature published from 1980 to October 2017. The data were collected from the Web of Science Core Collections, and a network analysis of research outputs was carried out to analyse the research trends in the nanoparticles literature. Nanoparticles and its applications are progressing in recent years. The results show that documents in the field of nanoparticles in chemistry and material science have improved in citation rate, as the authors were researching in multidisciplinary zones. Top-cited documents are mainly focusing on drug delivery, magnetic nanoparticles and iron oxide nanoparticles which are also the top research keywords in all papers published. Top-cited papers are mostly published in Biomaterials journal which so far has published 12% of top-cited articles. Although research areas such as contrast agents, quantum dots, and nanocrystals are not considered as the top-ranked keywords in all documents, these keywords received noticeable citations. The trends of publications on drug delivery and magnetic nanoparticles give a general view on future research and identify potential opportunities and challenges.
... Nesse sentido, será fornecida uma visão geral da literatura produzida sobre o empreendedorismo digital, baseado na exploração de redes bibliométricas, que propiciam a classificação e a estruturação do campo de conhecimento baseado em representações gráficas geradas a partir das medidas de relações entre unidades bibliométricas (periódicos, autores, palavras-chave etc.) (Knutas, Hajikhani, Salminen, Ikonen, & Porras, 2015). ...
Article
Full-text available
O objetivo desta pesquisa foi mapear a produção científica internacional sobre empreendedorismo digital. Para alcançar esse propósito, foi conduzida uma pesquisa do tipo exploratória, descritiva e bibliométrica, com abordagem quantitativa. A mensuração da produção científica foi baseada nas três leis clássicas da bibliometria: Lei de Bradford, Lei de Zipf e Lei de Lotka. O estudo seguiu as etapas definidas por Sousa, Fontenele, Silva e Filho (2019) para a condução de pesquisas bibliométricas, a saber: seleção da base de dados e do software bibliométrico, definição da amostra, levantamento da amostra, tabulação e tratamento de dados descritivos, e análise dos dados. Os dados foram coletados do banco de dados da Web Of Science e tabulados no software Excel. Para analisar as redes bibliométricas de coocorrência de palavras-chave e de citação (autores e periódicos) utilizou-se o software VOS Viewer. Este estudo mostra de forma geral as principais correntes temáticas exploradas em cenário internacional sobre o empreendedorismo digital: Transformação Digital, Estratégia, Data Science e Inovação Digital. Também sinaliza outros aspectos relacionados ao construto, oferecendo um respaldo teórico para outras pesquisas, sobretudo para ampliar o campo do conhecimento sobre o assunto analisado.
... A maioria dos estudos bibliométricos foi realizada com base na Web of Science(ESTOQUE ET AL., 2019;LIU, 2013). Neste estudo foi utilizado o aplicativo Network Analysis Interface for Literature Studies (NAILS) para a execução analítica e bibliométrica, o NAILS foi idealizado porKnutas et al. (2015). O NAILS trabalha de forma integrada com uma das maiores e mais consolidadas redes de periódicos de alto impacto do mundo.A Web of Science (WoS) (SALMINEN ET AL.,2019), além de ser facilmente acoplado a outros softwares de análise visuais, fornecendo a pesquisadores respaldo estatístico e acadêmico em suas publicações onde é utilizado o software R (FREIRE-SILVA ET AL., 2019). ...
Article
Full-text available
This study stems from research carried out as part of the lead author's doctoral work in development and environment, in which ecosystem services are one of the research topics. The study aimed to identify global research trends involving highly cited articles on Ecosystem Services (ES) from 2000 to 2020. The research is explanatory with a quantitative approach, using bibliometric analysis based on a survey carried out in the Network Analysis Interface for Literature Studies (NAILS) database as the source of articles published from 2000 to 2020, corresponding to the four search terms. The results cover the general landscape of research on ES, presenting, in order: (1) the most cited journals; (2) the most cited authors in studies on ES; (3) the year with the highest output of studies on ES; and (4) application by location, scientific area, and choice of ES. Among the points raised, the studies on ES as a whole have a conservationist character, and many studies on ES have an economic/monetary character, seeking the sustainable use of environments.
... The top 10 most important papers from dataset1 were identified by uploading the bibliometric information into NAILS (Network Analysis Interface for Literature Studies), an open source Social Network Analysis project (Knutas et al., 2015). Analytical tools were accessed from a GitHub repository and data processed in RStudio. ...
Thesis
Rising levels of anthropogenic underwater sound may have negative consequences on freshwater ecosystems. Additionally, the biological relevance of sound to fish and observed responses to human-generated noise promote the use of acoustics in behavioural guidance technologies that are deployed to control the movement of fish. For instance, acoustic stimuli may be used to prevent the spread of invasive fishes or facilitate the passage of vulnerable native species at man-made obstructions. However, a strong understanding of fish response to acoustics is needed for it to be effectively deployed as a fisheries management tool, but such information is lacking. Therefore, this thesis investigated the group behavioural responses of cyprinids to acoustic stimuli. A quantitative meta-analysis and experimental studies conducted in a small-tank or large open-channel flume were used to address key knowledge gaps that are necessary to improve the sustainability of acoustic deterrent technologies, and assist in conservation efforts to reduce the negative impacts of anthropogenic noise. Current understanding on the impact of anthropogenic noise on fishes (marine, freshwater and euryhaline species) was quantified. The impact of man-made sound is greatest for fish experiencing anatomical damage, for adult and juveniles compared to earlier life-stages, and for fish occupying freshwater environments. These findings suggest a review of the current legislation covering aquatic noise mitigation which commonly focus on marine-centric strategies, thereby undervaluing the susceptibility of freshwater fish to the rising levels of anthropogenic sound. Limitations and knowledge gaps within the literature were also identified, including: 1) group behavioural responses to sound, 2) the response of fish to different fundamental acoustic properties of sound, 3) system longevity (e.g. habituation to a repeated sound exposure), and 4) site-specific constraints. 
Fish movement and space use were quantified using fine-scale behavioural metrics (e.g. swimming speed, shoal distribution, cohesion, orientation, rate of tolerance and signal detection theory) and their collective response to acoustics assessed using two approaches. First, a still-water small tank set-up allowed for the careful control of confounding factors while investigating cyprinid group response to fundamental acoustic properties of sound (e.g. complexity, pulse repetition rate, signal-to-noise ratio). Second, a large open-channel flume enabled the ability of a shoal to detect and respond to acoustic signals to be quantified under different water velocities. Shoals of European minnow (Phoxinus phoxinus), common carp (Cyprinus carpio) and roach (Rutilus rutilus) altered their swimming behaviour (e.g. increased group cohesion) in response to a simple low frequency tonal stimulus. The pulse repetition rate of a signal was observed to influence the long-term behavioural recovery of minnow to an acoustic stimulus. Furthermore, signal detection theory was deployed to quantify the impact of background masking noise on the group behavioural response of carp to a tonal stimulus, and investigate how higher water velocities commonly experienced by fish in the wild may influence the response of roach to an acoustic stimulus. Fine-scale behavioural responses were observed the higher the signal-to-noise ratio, and discriminability of an acoustic signal and the efficacy at which fish were deterred from an insonified channel was greatest under higher water velocities. The information presented in this thesis significantly enhances our understanding of fish group responses to man-made underwater sound, and has direct applications in freshwater conservation, fish passage and invasive species management.
... The documents were also sorted into topics using the latent Dirichlet allocation (LDA) algorithm [31] using a modified version of the NAILS script [32], which utilizes the topicmodels R package [33], and visualized with the LDAvis library [34]. LDA can be used as a statistical text mining method for assigning documents into topics, which are detected using word association and distributions [35]. ...
Article
Full-text available
The systematic mapping study (SMS) is a relatively new method of generating new information from existing studies. First defined as a methodology in 2007, it offers a method to filter existing information to produce novel insight into the observed research domain, and pinpoint new directions of research. In this study, the systematic mapping study method was utilized to determine how SMS as a method has spread and was utilized during the first decade since its conceptualization. In general, it was found that the SMS method is still at its early phase in utilization, and is mainly used in software engineering and healthcare studies, but also in several other scientific domains. SMS research and the scientific outputs rely on transparent protocols when conducting the actual search and identification process, and so far, the applied protocol and research procedure correlates strongly with the application domain; different domains have their own protocols. The SMS method can be recommended, for example, when the aim is to gain knowledge on how a specific topic is studied and where there are research gaps. There are still areas that are debated or where successful implementation is difficult, the biggest problems being the amount of work it requires and possible lack of quality analysis of the articles.
... The exported files were analysed with the Network Analysis Interface for Literature Studies (NAILS cf. Knutas et al., 2015) in order to produce node and edge files of the bibliometric network. These files could then be imported into the open source network visualisation software Gephi (Bastian, Heymann & Jacomy, 2009), which was then used to filter the network to identify the specific research communities for the analysis. ...
Article
Full-text available
For a research article (RA) to be accepted, not only for publication, but also by its readers, it must display proficiency in the content, methodologies and discourse conventions of its specific discipline. While numerous studies have investigated the linguistic characteristics of different research disciplines, none have utilised Social Network Analysis techniques to identify communities prior to analysing their language use. This study aims to investigate the language use of three highly specific research communities in the fields of Psychology, Physics and Sports Medicine. We were interested in how these language features are related to the total number of citations, the eigencentrality within the community and the intra-network citations of the individual RAs. Applying Biber’s Multidimensional Analysis approach, a total of 771 RA abstracts published between 2010 and 2019 were analysed. We evaluated correlations between one of three network characteristics (citations, eigencentrality and in-degree), the corpora’s dimensions and 72 individual language features. The pattern of correlations suggests that features cited by other RAs within the discourse community network are in almost all cases different from those that are cited by RAs from outside the network. This finding highlights the challenges of writing for both a discipline-specific and a wider audience.
... Furthermore, reviewing research articles with this method provides an overview of extensive load documents leading readers to understand what the researchers have done or what is missing in our literacy. This technique is accepted in various fields, including sciences and art (Boonroungrut & Toe-Oo, 2017; Knutas, Hajikhani, Salminen, Ikonen, & Porras, 2015). Here this study will describe LD knowledge clusters and discuss the essential LD trend in each category. ...
Chapter
A systematic literature review is a sine qua non for performing significant research in the selected field. However, it is a complex, error-prone and time-consuming process. In order to reduce the probability of errors occurring during the literature review process, as well as to make it easier for researchers to perform this process, tools that support systematic literature review (SLR) are developed. Visualisation is essential for conducting an SLR. There is a lack of research about the software tools that could support the SLR process by means of visualisation. This paper focuses on identifying visualisation tools that are currently used in practice to support the SLR process. In the paper, we present and compare four tools in order to provide valuable information on current visualisation trends in the context of SLR, best practices, and preferable characteristics of visualisation tools for SLR.
Article
Full-text available
This work aims to analyse bibliometrically the studies on reservoirs in Brazil in the Web of Science and Scopus databases, seeking to answer the secondary questions: (1) which subjects are most addressed in articles on the topic; (2) whether the increase in research is related to reservoir disasters; and (3) what the state of development is of work on public policy and reservoirs. From the results, it was concluded that (1) the most cited studies on reservoirs in Brazil are based primarily on research on the planktonic community and on work on ichthyofauna. Among the journals that publish most on the topic, the field of Biological Sciences stands out, followed by Zoology and Aquaculture. (2) Given the natural growth trend in articles, the records of drought periods in different regions of Brazil, and the evident disasters, it is inconclusive to state that science produces more as a result of the disasters that affect reservoirs; and (3) compared with more general subjects, there is a shortage of work on the topic involving water policy in the scientific databases, which does not necessarily mean that there is little such work, but that much of it does not identify itself as part of the policy scope. It is essential that Brazil encourage its researchers to relate the subject more often, so that they contribute to decision-making in different spheres.
Article
Full-text available
Gephi is an open source software for graph and network analysis. It uses a 3D render engine to display large networks in real-time and to speed up the exploration. A flexible and multi-task architecture brings new possibilities to work with complex data sets and produce valuable visual results. We present several key features of Gephi in the context of interactive exploration and interpretation of networks. It provides easy and broad access to network data and allows for spatializing, filtering, navigating, manipulating and clustering. Finally, by presenting dynamic features of Gephi, we highlight key aspects of dynamic network visualization.
Article
Full-text available
The objective of this report is to propose comprehensive guidelines for systematic literature reviews appropriate for software engineering researchers, including PhD students. A systematic literature review is a means of evaluating and interpreting all available research relevant to a particular research question, topic area, or phenomenon of interest. Systematic reviews aim to present a fair evaluation of a research topic by using a trustworthy, rigorous, and auditable methodology. The guidelines presented in this report were derived from three existing guidelines used by medical researchers, two books produced by researchers with social science backgrounds and discussions with researchers from other disciplines who are involved in evidence-based practice. The guidelines have been adapted to reflect the specific problems of software engineering research. The guidelines cover three phases of a systematic literature review: planning the review, conducting the review and reporting the review. They provide a relatively high level description. They do not consider the impact of the research questions on the review procedures, nor do they specify in detail the mechanisms needed to perform meta-analysis.
Article
Full-text available
The exponential growth in the number of scientific papers makes it increasingly difficult for researchers to keep track of all the publications relevant to their work. Consequently, the attention that can be devoted to individual papers, measured by their citation counts, is bound to decay rapidly. In this work we make a thorough study of the life-cycle of papers in different disciplines. Typically, the citation rate of a paper increases up to a few years after its publication, reaches a peak and then decreases rapidly. This decay can be described by an exponential or a power law behavior, as in ultradiffusive processes, with exponential fitting better than power law for the majority of cases. The decay is also becoming faster over the years, signaling that nowadays papers are forgotten more quickly. However, when time is counted in terms of the number of published papers, the rate of decay of citations is fairly independent of the period considered. This indicates that the attention of scholars depends on the number of published items, and not on real time.
Chapter
The patterns of interactions, both economic and otherwise, between individuals, groups or corporations form social networks whose structure can have a substantial effect on economic outcomes. The study of social networks and their implications has a long history in the social sciences and more recently in applied mathematics and related fields. This article reviews the main developments in the area with a focus on practical applications of network mathematics.
Chapter
This chapter provides an introduction to the topic of visualizing bibliometric networks. First, the most commonly studied types of bibliometric networks (i.e., citation, co-citation, bibliographic coupling, keyword co-occurrence, and coauthorship networks) are discussed, and three popular visualization approaches (i.e., distance-based, graph-based, and timeline-based approaches) are distinguished. Next, an overview is given of a number of software tools that can be used for visualizing bibliometric networks. In the second part of the chapter, the focus is specifically on two software tools: VOSviewer and CitNetExplorer. The techniques used by these tools to construct, analyze, and visualize bibliometric networks are discussed. In addition, tutorials are offered that demonstrate in a step-by-step manner how both tools can be used. Finally, the chapter concludes with a discussion of the limitations and the proper use of bibliometric network visualizations and with a summary of some ongoing and future developments.
Article
We present CitNetExplorer, a new software tool for analyzing and visualizing citation networks of scientific publications. CitNetExplorer can for instance be used to study the development of a research field, to delineate the literature on a research topic, and to support literature reviewing. We first introduce the main concepts that need to be understood when working with CitNetExplorer. We then demonstrate CitNetExplorer by using the tool to analyze the scientometric literature and the literature on community detection in networks. Finally, we discuss some technical details on the construction, visualization, and analysis of citation networks in CitNetExplorer.
Article
Background: In fields such as medicine, psychology, and education, systematic reviews of the literature critically appraise and summarize research to inform policy and practice. We argue that now is an appropriate time in the development of the field of engineering education to both support systematic reviews and benefit from them. More reviews of prior work conducted more systematically would help advance the field by lowering the barrier for both researchers and practitioners to access the literature, enabling more objective critique of past efforts, identifying gaps, and proposing new directions for research. Purpose: The purpose of this article is to introduce the methodology of systematic reviews to the field of engineering education and to adapt existing resources on systematic reviews to engineering education and other developing interdisciplinary fields. Scope/Method: This article is primarily a narrative review of the literature on conducting systematic reviews. Methods are adapted to engineering education and similar developing interdisciplinary fields. To offer concrete, pertinent examples, we also conducted a systematic review of systematic review articles published on engineering education topics since 1990. Fourteen exemplars are presented in this article and used to illustrate systematic review procedures. Conclusions: Systematic reviews can benefit the field of engineering education by synthesizing prior work, by better informing practice, and by identifying important new directions for research. Engineering education researchers should consider including systematic reviews in their repertoire of methodologies.