Learned Publishing, 2025; 38:e2001
https://doi.org/10.1002/leap.2001
ORIGINAL ARTICLE | OPEN ACCESS
What Are Journals and Reviewers Concerned About in
Data Papers? Evidence From Journal Guidelines and
Review Reports
XinyuWang1 | LeiXu1,2
1School of Infor mation Management, Wuhan University, Wuhan, China | 2Research Institute for Publishing, Wuhan University, Wuhan,China
Correspondence: Lei Xu (2018301040012@whu.edu.cn; xlei@whu.edu.cn)
Received: 6 August 202 4 | Revised: 2 Februar y 2025 | Accepted: 7 February 2025
Funding: This work was supported by National Social Science Fund of China, 22BTQ106.
Keywords: data jour nal| data paper| guideline| peer review
ABSTRACT
The evolution of data journals and the increase in data papers call for associated peer review, which is intricately linked yet
distinct from traditional scientific paper review. This study investigates the data paper review guidelines of 22 scholarly journals
that publish data papers and analyses 131 data papers' review reports from the journal Data. Peer review is an essential part of
scholarly publishing. Although the 22 data journals employ disparate review models, their review purposes and requirements
exhibit similarities. Journal guidelines provide authors and reviewers with comprehensive references for reviewing, which cover
the entire life cycle of data. Reviewer attitudes predominantly encompass Suggestion, Inquiry, Criticism and Compliment during
the specific review process, focusing on 18 key targets including manuscript writing, diagram presentation, data process and
analysis, references and review and so forth. In addition, objective statements and other general opinions are also identified. The
findings show the distinctive characteristics of data publication assessment and summarise the main concerns of journals and
reviewers regarding the evaluation of data papers.
1 | Introduction
The advancement of science has heightened the demand for sci-
entific data, alongside a greater emphasis on transparency and
accessibility. However, conducting experiments or surveys to
generate valuable scientific data entails substantial costs and
complex procedures. Thus, assessing scientific data quality
and its potential applicability is imperative in determining the
prospective utility of research findings (Yardimci et al. 2019).
Data journals play a crucial role in this landscape by regularly
publishing peer reviewed data papers. These papers provide in-
centives for data creators to verify, document, review and dis-
seminate their work (Walters2020). A data paper is a scholarly
journal publication that provides comprehensive and searchable
metadata, with the primary objective of describing a dataset or
a collection of datasets in accordance with standard academic
practices (Chavan and Penev2011; Penev etal.2012). The jour-
ney of a data paper from conception to publication undergoes a
pivotal stage known as peer review, which necessitates a com-
prehensive evaluation not only of the research methodology and
scientific findings but also of the accessibility, quality, and reus-
ability of the research data. While the peer review process for tra-
ditional research papers is well- established, the process for data
papers is still in its early stage and warrants further discussion
(Kong etal.2019). Peer reviewers not only help journal editors
in scrutinising manuscripts for publication but also enhances
the rigour, novelty and value of submitted papers (Garcia- Costa
et al. 2022; Squazzoni et al. 2020). With the development of
open science, open peer review has emerged and brought about
more advantages such as encouraging high- quality reviews,
enhancing transparency and supporting systematic study of
peer review with open review reports (Polka etal.2018).
In addition, datasets and data papers serve as the fundamental
basis for numerous data-driven studies, with researchers inclined
to utilise high- quality and highly usable datasets. Hence, it is
necessary to establish rigorous standards for data paper publica-
tion and provide convenience for data reuse (Kotti etal.2020).
Achieving standardised criteria for data paper review proce-
dures poses a significant challenge due to differences in disci-
plines and data types (Lawrence etal.2011). Data journals and
reviewers serve as crucial gatekeepers in facilitating an efficient
and effective system for validating and disseminating data-
driven research. In this study, we have gathered review guide-
lines from data journals and open review reports of data papers
to provide a comprehensive perspective encompassing both the
journal's standpoint and the reviewer's viewpoint, seeking to ad-
dress the following questions:
RQ1. How do data journals advise on and regulate reviewers'
behaviour during the data paper review process?
RQ2. How do reviewers review data papers in practice?
2 | Literature Review
2.1 | Data Journal and Data Paper
The availability of data to the public is in line with the open
science movement. However, datasets deposited in public re-
positories and databases may not undergo independent qual-
ity checks. Data journals, serving as gatekeepers, ensure the
adherence of datasets to established standards, provision of
comprehensive metadata, and minimal occurrence of errors
(Costello etal.2013). Unlike traditional research papers, which
briefly describe data collection and mainly highlight discussion
and scientific results, data journals prioritise research data.
They establish a connection between data repositories and
scholarly journals, emphasising sufficient description of meth-
ods for collecting and processing data to facilitate data resource
reuse (García- García etal.2015). After examining some of the
characteristics of data journals in detail, Walters(2020) points
out that the distinctive merits of data journals lie in the rig-
orous quality control of data and documentation, increased
discoverability, incentivising data publication, sustainability
assurance and augmentation of research output efficiency. As
an effective means for data discovery and dissemination, data
papers offer concrete incentives such as getting citations, re-
ceiving feedback and the opportunities of new collaborations
(Suhr etal.2020).
Data papers support researchers' need of data sharing. They
act as a hybrid form encompassing both pure data descriptions
and traditional analytical papers. These papers possess norma-
tive citation potential, ensuring that data with lasting value to the academic community is easily findable and reusable
(Thelwall 2020). The formal process of publishing data papers
in academic journals facilitates researchers in accessing, com-
prehending, reproducing, expanding upon, and citing data
within their research endeavours (Kong etal.2019). Currently,
the numbers of data journals and data papers appear to be growing slowly, and data papers are not generally counted in
academic evaluation systems. Nevertheless, these publications
significantly contribute to the assessment of data value, bringing
with them stringent requirements for metadata standardisation
and specialisation to complement research papers. Moreover, the
genre of data papers does foster data sharing and reuse of valu-
able datasets (Jiao and Darch 2020; Schöpfel etal.2020). They
enable a division of labour in which resource- endowed entities
can perform experiments and surveys to create useful datasets,
so that others with the capacity to reproduce the data may reuse
them when appropriate (Rees2010). Overall, the basic concepts
of data journals and data papers are increasingly well established, and their
distinctive scientific value is being acknowledged by scholars,
while their role in scientific communication and publishing sys-
tems remains to be further investigated.
2.2 | Peer Review of Data Paper
Researchers can effectively ‘stand on the shoulders of giants’ to
conduct innovative research by reproducing and reusing published
datasets (Jiao etal.2024). The peer review process has long served
as a safeguard in the academic ecosystem, and recent emphasis on
transparency and openness has influenced the evolution of review
practices for data papers. Publication of review reports encourages
thorough manuscript assessment and thoughtful comments from re-
viewers (Seo and Kim 2020; Wolfram etal. 2021). Considering the
challenges associated with assessing data quality due to the large volume and complex structure of datasets, the disclosure of the review process may serve as a means of providing evidence of the quality of data papers (Candela et al. 2015). Open peer review leads to increased page views, down-
loads and sharing of papers, thereby promoting knowledge dissemi-
nation within the scientific community (Wei etal.2023). However,
this model is not yet widely adopted for reviewing data papers. Many
journals still opt for traditional models like single- anonymous peer
review. Therefore, there is room for further research on implement-
ing open peer review in evaluating data papers while ensuring high-
quality reviews.
Journal guidelines play a crucial role in shaping the structure of data
papers, which has enduring implications for the dissemination of data-
focused knowledge within the research system (Li and Jiao 2022).
They serve as essential references and constraints for authors and reviewers alike.

Summary

We identified a dearth of guidance and transparency in the peer review process for data papers published in scholarly journals at present.
• Among journals that provide review guidelines for data papers, these guidelines generally cover key stages of the data life cycle, with varying emphasis on specific aspects.
• We conducted an analysis of the primary concerns raised by reviewers during the evaluation of data papers, utilising the open review reports provided by the journal Data.
• The convergence of review dimensions between the review guidelines and the review reports signifies a consensus reached by journals and reviewers in evaluating data papers.

Considering the primary objective of data publication,
it is imperative that data papers provide extensive data documentation
conforming to the guidelines stipulated by data journals. Kim(2020)
categorised contextual information from templates and guidelines of
data journals into four groups: general dataset properties, data produc-
tion information, repository information, and data reuse information.
In addition, through an examination of data journal review policies
and consultation with editors, Seo and Kim(2020) found that peer re-
view of data focuses on the appropriateness of data production meth-
odology and detailed descriptions of the methodology. Data journals
recommend or require authors and reviewers to verify aspects such as
data formats, open licensing information and persistent dataset identifiers, while paying particular attention to metrics related to data accessibility
and reusability.
The most common peer review models are single- anonymous, double-
anonymous and open peer review. As academic publishing has evolved,
new models have emerged, including transparent, collaborative and post-publication peer review (Wiley 2020). The word open refers to the visibility of the identities of both reviewers and authors, and transparent
means that the review report is published alongside the paper. The im-
provements to traditional models and experimentation with emerging
models reflect the evolving nature of peer review. This study highlights
the foundational role of peer review in the quality control for both data
papers and scientific data. It draws upon open- access resources on the
review process to enhance our understanding of peer review specifi-
cally pertaining to data papers. The analysis of review dimensions from
the perspectives of journals and reviewers can contribute to the stan-
dardisation and effectiveness of reviewing processes associated with
data papers.
3 | Materials and Methods
3.1 | Data Journal Collection
Based on various factors, including the reputation of publish-
ers, disciplinary diversity and journal impact factor, we have
identified 22 scholarly journals that publish data papers. This
identification process involved referring to existing references,
consulting academic institutions' lists, and visiting the home-
pages of the identified journal (Akers 2014). The relevant in-
formation is shown in Table 1, encompassing commercial
academic publishers such as Elsevier, Springer and Wiley,
alongside research university presses like Oxford and MIT that
actively participate in the publication of data journals. Journals
that publish data papers exclusively are labelled as Pure, while
journals that publish both data papers and research papers are
classified as Hybrid. These journals vary significantly in terms
of discipline, but all implement rigorous and well- established
peer review models. The majority of data journals employ a
single- anonymized model for peer review, while transparent
and open review only account for a relatively small percentage
(18.18%).
Among these approaches is the transparent review model ad-
opted by BMC Research Notes, where reviewer reports accompany data papers without disclosing reviewers' identities (Springer Nature 2020). Earth System Science Data (2024) employs a two-stage review wherein the topic editor provides suggestions on basic scientific quality and technical corrections during the first stage, called access review, followed by an open discussion stage in which manuscripts become preprints, allowing logged-in users to comment on them.
As shown in Figure1, a subset of the sample journals (13) pro-
vides explicit review guidelines on their homepages, which
serve as useful references for author submissions and reviewer
assessments. The 13 selected journals included both hybrid and
pure data journals, with 61.5% of them (8) providing review
guidelines for data papers. These guidelines address the char-
acteristics of data papers and evaluation dimensions of research
data. The remaining journals (5) present review guidelines spe-
cifically tailored for submissions categorised as research papers.
3.2 | Data Paper Review Reports Collection
The coverage of data papers in existing academic databases is
insufficient, and there are also inconsistencies in the indexing
of data journals and data papers (Jiao et al. 2023). In addition
to the use of the term ‘Data Paper’ (Annals of Forest Science, Biodiversity Data Journal, Ecology, Journal of Open Archaeology Data, Journal of Open Humanities Data, The International Journal of Robotics Research, ZooKeys), some journals adopt the terms ‘Data Article’ (Big Earth Data, Chemical Data Collections, Data in Brief, Data Intelligence, Ecological Research, Geoscience Data Journal, Global Ecology and Biogeography), ‘Data Descriptors’ (Data, Earth System Science Data, Scientific Data), ‘Data Note’ (BMC Research Notes, GigaScience) and so on. As the material
in this study mainly comes from the Web of Science, where the
publication type is defined as Data Paper, we use the term Data
Paper throughout the manuscript.
Among the sampled journals that adopt an open peer review model, only the journal Data publishes review reports in conjunction with the data papers. The journal Data is an open access publication
with high visibility, covering datasets and data- related processes
across disciplines. In terms of the characteristics of open peer
review, the transparency of the journal Data primarily mani-
fests in the publishing of review reports and author–reviewer inter-
actions. Through links provided on the journal Data's website,
readers can access and download review reports for data papers
that have undergone open peer review.
For this research, a total of 131 data papers' open review reports
published in the journal Data until 31 October 2023 were col-
lected. The first data papers accompanied by published review reports appeared in 2018, with only two instances observed. Figure 2 depicts the annual quantity of data papers published by the journal Data using open peer review from 2019 to 2023, averaging approximately 26 papers per year. In comparison to the overall publication count in the journal Data, data papers with published review reports account for a relatively low percentage, ranging from 20% to 40% per year. This indicates that the adoption and acceptance of open peer review by authors and reviewers is still limited. A word frequency analysis of the keywords of the data papers revealed 718 keywords that collectively represent the thematic richness of the research published in the journal Data. A significant proportion of these
keywords are related to computer science and information sys-
tems (dataset: 10, remote sensing: 6, machine learning: 6, deep
learning: 5, computer vision: 4, climate change: 4, waste man-
agement: 4, convolutional neural network: 3, circular economy:
3, Sentinel- 2: 3).
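As an illustration, a keyword count of this kind can be reproduced with a short script along the following lines; the input file name and its one-line-per-paper, semicolon-separated format are assumptions made for the sketch rather than the authors' actual workflow.

```python
from collections import Counter

# Illustrative input: one line per data paper, author keywords separated
# by semicolons, e.g. "dataset; remote sensing; machine learning".
with open("data_paper_keywords.txt", encoding="utf-8") as f:
    keywords = [
        kw.strip().lower()
        for line in f
        for kw in line.split(";")
        if kw.strip()
    ]

counts = Counter(keywords)
print(f"{len(counts)} distinct keywords")
for keyword, freq in counts.most_common(10):
    print(f"{keyword}: {freq}")
```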
3.3 | Methods
As presented in Figures3 and 8 journals of all have provided re-
view guidelines for reviewing data papers, and the journal Data
published open review reports. We have exported 465 review re-
ports of 131 published data papers f rom the journal Data's home-
page, comprising a total of 79,590 words (including paper titles).
Through using NVivo 14.23.0 to annotate the textual material
of the journal guidelines and the data paper review reports, we
constructed two review frameworks respectively. For the review
reports, an initial framework is gradually constructed by means
of annotations to mark up the text of review reports until no
new types are produced. The mark- up process here covers ap-
proximately one- third of the sample, which consists of 45 data
papers. Then, a comprehensive framework is defined by syn-
thesising references from literature, as well as discussion and
modification of the framework among the authors of this paper,
as depicted in Figure7. Finally, NVivo 14.23.0 was used to tag
all review reports in 2683 references. In addition, the 2683 ref-
erences derived from the tagging of review reports were filtered
to obtain comments closely related to data(sets), and were subse-
quently analysed with word frequency statistics.
TABLE 1 | Characteristics of the collected data journals.

Journal (Publisher) | Publication types | Discipline | Review model
Annals of Forest Science (Springer France) | Hybrid | Multidisciplinary | Double anonymized
Atomic Data and Nuclear Data Tables (Academic Press Inc. Elsevier Science) | Pure | Atomic physics and Nuclear physics | Single anonymized
Big Earth Data (Taylor & Francis) | Hybrid | Earth science | Single anonymized
Biodiversity Data Journal (Pensoft) | Hybrid | Biodiversity science | Single anonymized
Bioinformatics (Oxford University Press) | Hybrid | Genome bioinformatics and Computational biology | Single anonymized
BMC Research Notes (Springer Nature) | Hybrid | Scientific and clinical disciplines | Transparent peer review
Chemical Data Collections (Elsevier) | Pure | Chemistry | Single anonymized
Data (MDPI) | Hybrid | Multidisciplinary | Single anonymized and Optional open peer review
Data in Brief (Elsevier) | Pure | Multidisciplinary | Single anonymized
Data Intelligence (MIT Press) | Hybrid | Multidisciplinary | Single anonymized
Earthquake Spectra (Sage) | Hybrid | Engineering | Single anonymized
Earth System Science Data (Copernicus Publications) | Hybrid | Interdisciplinary | Access review and Open discussion
Ecological Research (Wiley) | Hybrid | Ecology | Single anonymized
Ecology (Wiley) | Hybrid | Ecology | Single anonymized
GigaScience (Oxford University Press) | Hybrid | Life science and Biomedical research | Open peer review
Geoscience Data Journal (Wiley) | Pure | Geoscience | Single anonymized
Global Ecology and Biogeography (Wiley) | Hybrid | Ecology, Geography | Double anonymized
Journal of Chemical & Engineering Data (American Chemical Society) | Pure | Chemical Engineering, Chemistry | Single anonymized
Journal of Open Archaeology Data (Ubiquity Press) | Pure | Archaeology | Single anonymized
Scientific Data (Nature Portfolio) | Pure | Multidisciplinary | Single anonymized
The International Journal of Robotics Research (Sage) | Hybrid | Robotics | Single anonymized
ZooKeys (Pensoft) | Hybrid | Zoology | Single anonymized
4 | Data Analysis and Results
4.1 | Review Guidelines of Data Journals
At present, there is no uniform principle governing the review
of data papers. Review guidelines provided by journals tend
to emphasise certain phases such as data processing and data
reuse during the overall data publishing process. As shown in
Figure 4, we have identified a total of 10 major phases based
on an analysis of the guidelines from the 8 journals we collected and existing research (Seo and Kim 2020). These phases, marked in solid rectangular boxes, collectively encompass the entire publishing
process of scientific data. Furthermore, the main review con-
cerns during each phase are extracted or generalised from these
guidelines, and they are marked in dashed rectangular boxes.
Review guidelines may vary due to differences between pub-
lishers and disciplines. Some guidelines may be derived from
publishers' standardised requirements rather than specific
specifications for data papers. However, based on the obtained
guidelines, it is evident that journals prioritise dataset- related
attributes when evaluating data papers. First, the accessibility
attribute imposed by journals on data papers is reflected in
the requirement for authors to provide a direct link to the data-
set and corresponding repository. It ensures that both data and
analysis tools can be readily accessed and utilised for verifying
data quality. Furthermore, the assignment of a unique identi-
fier is crucial for enhancing accessibility as well as facilitating
long- term data archiving. It is essential that datasets are stored
in an open, non- proprietary format in suitable and sustainable
repositories, enabling easy access during the review process and
ensuring effective management after publication. This approach
not only facilitates reuse of published datasets within their orig-
inal disciplines but also promotes broader utilisation across di-
verse communities.
FIGURE 1 | Review guidelines distribution among the data journals.

FIGURE 2 | Publication information of data papers in the journal Data.

FIGURE 3 | Workflow of this study.

In the scientific community, researchers often play multiple roles as data generators, reviewers, and disseminators for data sharing and reuse. Repositories and databases recommended by and collaborating with data journals include Zenodo, Figshare, Mendeley
Data, and so forth. These platforms can be quickly utilised by
multi- role researchers for various research tasks. Moreover, it
is necessary for the entire dataset to be publicly available with
the aim of achieving the widest range of applications and ensuring optimal scalability within the research community. As a result, academic journals typically mandate that authors deposit datasets with open licences such as a CC0 waiver or a CC-BY licence and
request reviewers to verify their compliance.
Based on the guidelines provided on the journal homepages, we
have identified the review phases of the 8 data journals as shown
in Table2. In review guidelines, these journals place much em-
phasis on the methodology employed for data collection and
processing, as well as the quality of datasets, which are closely
associated with reproducibility and reusability of data. Notably,
GigaScience, with reproducibility, usability and utility serving as key publication criteria, stands out by offering comprehensive guidelines for reviewing data papers that adhere to the FAIR principles (Oxford University Press 2023).
FIGURE 4 | Review phases and their main concerns from review guidelines.

TABLE 2 | Distribution of review phases in review guidelines of data journals. Phases covered: Data preparation, Data creation, Data processing, Data analysis, Data description, Data validation, Data interpretation, Data presentation, Data management and Data reuse. Journals covered: Data, Data in Brief, Earth System Science Data, Ecological Research, Ecology, GigaScience, Journal of Open Archaeology Data and Scientific Data.

4.2 | Review Reports of Data Papers

4.2.1 | An Overview of Review Reports

The review guidelines on journals' homepages contain a wealth of information, including an introduction to the basic workflow of peer review, a variety of dimensions for evaluation, and specific questions for reviewers to consider when evaluating
manuscripts and writing review comments. Among the review
reports we obtained from 131 data papers of the journal Data,
only a few were written according to the review template given by the journal, which covers three dimensions: Data description; Data quality; and Data access, archiving and metadata. Reviewers
responded to the main concerns and provided more comprehen-
sive review opinions based on the content of the manuscripts
and their own professional knowledge.
We measured the length of each data paper's review reports, as shown in Figure 5. The typical length of review reports for data papers is between 200 and 700 words, with a certain number of reviewers (12) submitting reports containing detailed information that exceeds 1000 words. Additionally, 21 reviewers presented their comments in attached files (Word or PDF), mostly in the form of interlinear annotations, so we did not analyse them.
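A minimal sketch of how such word counts could be computed is given below; it assumes each review report has been saved as a separate plain-text file, and the directory name and simple whitespace tokenisation are illustrative assumptions rather than the authors' actual procedure.

```python
from pathlib import Path
from statistics import median

# Illustrative layout: one plain-text file per review report,
# exported from the journal Data's website.
report_dir = Path("review_reports")

lengths = []
for report_file in sorted(report_dir.glob("*.txt")):
    text = report_file.read_text(encoding="utf-8")
    n_words = len(text.split())  # simple whitespace tokenisation
    lengths.append(n_words)

print(f"{len(lengths)} reports, median length {median(lengths)} words")
print(f"reports over 1000 words: {sum(1 for n in lengths if n > 1000)}")
```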
In general, data papers have undergone at least one round of peer
review, with 2–4 reviewers per paper according to the publicly available review reports. Excluding one 1062-word report in the second round, the length of the second round of review reports is obviously shorter than that of the first (Figure 6). In
total, more than half of the data papers underwent two rounds
of review (69), while fewer papers went through three (6) and
four (1) rounds. Among the data papers reviewed in the second
round, a majority (50) received reports of less than 100 words, result-
ing in lower median and mean values compared to those in the
third round. This phenomenon can be attributed to reviewers'
brief responses such as checking and confirming suggested
changes that were proposed in the first round. However, man-
uscripts undergoing the third round of review necessitate more
extensive discussions, thereby resulting in an increased length.
Furthermore, there is a difference in the decision suggestions
provided by reviewers at different rounds (Table3). Of the 465
review reports collected, only 132 provided a clear decision sug-
gestion, with the majority of these appearing in the first round
(70%). Overall, the published data papers mainly received com-
ments related to acceptance and revision, indicating that the
reviewers' suggestions played a reference role and were in line
with the basic function of peer review. The editors also have the
authority to consider the comments of several reviewers and
make the final decision regarding the manuscript. The collabo-
ration between the quality gatekeepers and the authors is essen-
tial for the improvement of data papers.
4.2.2 | Analysis of Review Reports
As depicted in Figure 7, the first level of the framework com-
prises six attitude types, while the subsequent level encompasses
specific evaluation targets. A total of 2683 references were iden-
tified and classified into the corresponding review target, with
definitions and examples in Table4. The references exhibiting
explicit attitudes are categorised as Compliment, Criticism,
Inquiry and Suggestion, targeting 18 aspects of the data paper.
Although some data papers are characterised by disciplines
such as geography, chemistry, computer science, and so forth,
the review targets we use are centred on the data and the papers
themselves. These targets are relatively general and neutral, and
can be appropriately applied to reviewing multidisciplinary data
papers. The additional types, Statement and Others, most often appear at the beginning and end of the review reports. The Statement type primarily consists of the reviewer's reiteration or brief summary
of the title, abstract and main content of the data paper. This
indicates that the reviewer has a general understanding of the
reviewed paper, which is necessary for conducting subsequent
review task. Three targets are grouped into Others: appreciation, decision suggestion and revision problem, which together account for a certain proportion (6%) of the review reports. Decision suggestions include Accept, Minor Revision, Major Revision and Reject.
FIGURE 5 | Length of review reports of the journal Data.

By applying the review framework, Figure 8 illustrates the results of a comprehensive analysis of the review reports, excluding the types Statement and Others. Through sentence tagging, it
was observed that manuscript writing is a major concern for re-
viewers, akin to research paper reviews. The expressions in the
reports were categorised based on attitude and purpose, arriving
at four main types: Suggestion (1192), Inquiry (481), Criticism
(327) and Compliment (323). These types are sorted in descend-
ing order according to their prevalence as mentioned above,
with a significantly higher proportion of comments falling
under the category of Suggestion compared to other types. The
phenomenon is in line with the objective of peer review, which
aims to enhance the quality of manuscripts.

FIGURE 6 | Length of review reports in three rounds.

TABLE 3 | Distribution of decision suggestions given by reviewers at different rounds.

Round | Accept | Minor revision | Major revision | Reject
1st | 30 | 36 | 18 | 9
2nd | 33 | 3 | / | 1
3rd | 2 | / | / | /

FIGURE 7 | Review framework from review reports.

TABLE 4 | Definitions and examples of evaluation targets from review reports.

1. Data collection: The process of collecting the data, including the criteria, objects, volume and others. Example: "Although the data collected is interesting, the dataset is based on only 10 buildings. It is suggested to extend the representative samples considerably."
2. Data process and analysis: The process involved in the cleaning, transformation, statistical analysis and other forms of processing data. Example: "The authors can provide more information about the processing of data and what is the software used and is there any need to clean the data?"
3. Data description: Clear and detailed description of the dataset, especially the metadata of the dataset. Example: "The article would also improve if it is added a more detailed description of the dataset data, with some statistics, description of the information fields, and so forth."
4. Data publish and access: The process of publishing the dataset and making it publicly accessible. Example: "The inclusion of a simpler link that redirects to the relevant repository would undoubtedly enhance the accessibility of your work."
5. Data archive and management: The process of archiving and long-term management of data. Example: "The authors need to clear about Standartox database will be maintained by what institution?"
6. Data value and usage: Assessment of the importance and potential value of the dataset and the extent to which it can be reused. Example: "Maybe you can comment to me, what to do with this high-quality data for further research in a few sentences in the summary?"
7. Data presentation: Graphs, charts or other visualisations used to present data. Example: "Please redraw Figure 3. The place names marked in Figure 3 are completely unclear. I don't know why the author gave them."
8. Journal requirement adaptability: Evaluation of the manuscript's compliance with the scope and requirements of the journal. Example: "This paper fits the scope of the journal and is aligned with the presentation of this type of work."
9. Manuscript topic: Evaluation of the novelty, relevance and research value of the manuscript topic. Example: "Odour nuisance from cat urine is well-known problem and methods to reduce it is needed, so the topic is very relevant."
10. Methods: Evaluation of the methods and techniques used in the study and their suitability and efficacy in addressing research questions. Example: "The methodology is classical and robust, this is, in my opinion, the sign that the data acquired are reliable."
11. Manuscript structure: Examination of the discourse structure and necessary sections of the manuscript. Example: "The paper lacks the conclusion. Even if it is a data descriptor, it also needs a systematic summary."
12. Writing: Evaluation of the manuscript's language, logic, coherence and ease of reading for the reader's comprehension. Example: "Such manuscripts should be written in the third person. Change this throughout the text, please."
13. Novelty and contribution: Evaluation of the novelty of the insights and methods presented in the manuscript, and the contribution to scholarship and practice. Example: "The IMU-based system must be emphasised in the abstract and the introduction since it is the main contribution after the dataset."
(Continues)

Direct and specific
suggestions are more likely to assist authors in identifying
problems and adopting effective solutions. Reviewers typically
possess high professionalism and substantial experience in re-
viewing. Therefore, their suggestions not only facilitate authors
in enhancing manuscripts but also provide valuable references
for other researchers due to the openness of the review reports.
First, sentences of the Suggestion type mostly contain modal
verbs and specific improvement measures. Many meticulous re-
viewers focus on precise line-by-line modifications and put forward optimisation requirements on the form and content of the diagrams in the manuscripts. Moreover, they give professional opinions on technical details along with suggestions for expanding conclusions. Second, Inquiry typically appears in the form of questions or counter-questions. Reviewers predominantly inquire about data processing and analysis techniques and data collection, including the selection of methodology and tools, sample range, noise control and so forth, followed by requests for explanations regarding apparent errors or unclear statements in the manuscript. These inquiries serve both as necessary references for reviewers to assess data quality and as a means of improving the readability and usability of data papers. Third, the judgement of the Compliment and Criticism attitude types mainly depends upon the adjectives used by reviewers. Adjectives such as interesting, concise and valuable are employed to affirm a paper's innovation, data significance and writing fluency. Conversely, adjectives such as incomplete, limited and mediocre are used to criticise excessive typos as well as insufficient references and related information.
4.2.3 | Analysis of Review Related to Datasets
In order to gain insights into the review content directly related
to data, we extracted 681 references containing the word data or its variants (data set, datasets, data sets) from all of the review reports by utilising regular expressions. These references were classified according to the previously described attitudes and targets, as depicted in Figure 9. The extracted references primarily focus on various aspects of scientific data, encompassing data value and usage (133), data process and analysis (98) and data collection (93).
TABLE 4 | (Continued)

14. References and literature review: Evaluation of the relevance and comprehensiveness of citations and the literature review. Example: "It would be appropriate to add the relevant literature on obtaining data on the emission of GHG emissions from maritime transport."
15. Related information: Evaluation of the background information, underlying theories, tools and techniques covered in the manuscript. Example: "Also, some theory related to the correlation between trust, satisfaction, perceived value—would be useful."
16. Research design: Evaluation of defining research questions, selecting samples and methods as part of the overall design of the research. Example: "The purpose of the study was clear. However, the research questions used were unclear."
17. Scientific ethics: Reviewers examine whether the authors' research behaviour adhered to ethical standards and the degree of respect given to research participants. Example: "All these data are from YouTube. If the face data are released, will it violate some rules? Is it legal to public and process the face data on the website?"
18. Scientific finding: Evaluation of the impact of the results and findings of the manuscript on the advancement of the relevant field. Example: "In the end manuscript, state whether your data processing had solved the problems occurred or not."
19. Restatement: Reviewers restated the main contents of the manuscript. Example: "The manuscript describes a dataset containing multispectral images captured using UAV."
20. Appreciation: Reviewers appreciated the manuscript's contribution and the authors' revision. Example: "We are grateful to the authors for making the dataset available online and for the explanation of its structure."
21. Decision suggestion: Reviewers put forward suggestions on whether or not to accept the manuscript. Example: "The paper can now be accepted and it present and well-rounded scientific presentation of a needed dataset."
22. Revision problem: Reviewers put forward problems that occurred in the revision process. Example: "However, I hope the authors go again and consider my comments in details. I am not fully satisfied with the authors response to my comments."

This underscores the fundamental importance of these elements in evaluating the overall quality of data papers. Additionally, detailed feedback on data-related aspects enables other researchers to replicate and verify the findings, thereby fostering trust and advancing progress in the scientific community. Besides, the distribution of attitude types shifts compared with the overall distribution in Figure 8: while Suggestion still dominates, Criticism decreases, and the numbers of Inquiry and Compliment are almost equal. This trend is in line with the gatekeeping role that peer review plays in ensuring rigorous scrutiny of published data.
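A minimal sketch of the regular-expression filtering described in this subsection is shown below; it assumes the tagged references are available as plain strings, and the pattern and function names are illustrative rather than the authors' actual code.

```python
import re

# Match "data", "dataset(s)" and "data set(s)" as whole words, case-insensitively.
DATA_PATTERN = re.compile(r"\bdata\s?sets?\b|\bdata\b", re.IGNORECASE)

def extract_data_related(references):
    """Return the tagged references that mention data or datasets."""
    return [ref for ref in references if DATA_PATTERN.search(ref)]

# Example usage with two illustrative review comments.
comments = [
    "The dataset should be deposited in an open repository.",
    "Please improve the language of the introduction.",
]
print(extract_data_related(comments))
# -> ['The dataset should be deposited in an open repository.']
```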
The Natural Language Toolkit (NLTK) Python package (Loper
and Bird2002) was employed to further analyse the data- related
comments, enabling the identification of verbs, adjectives and
nouns in sentences. Word frequency counts were conducted using
these identified words, and the high- frequency words of each type
are presented in Table 5. These words reflect some consensus
formed by reviewers during the process of evaluating data papers.
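A minimal sketch of this part-of-speech-based frequency count using NLTK is shown below, assuming the data-related comments are available as a list of strings. Unlike the counts in Table 5, the sketch tallies surface word forms without lemmatisation, so inflected forms are counted separately; the tokenisation rule is also an illustrative simplification.

```python
from collections import Counter
import re
import nltk

# The POS tagger model needs to be downloaded once; the resource name differs
# slightly across NLTK versions, so both identifiers are requested.
nltk.download("averaged_perceptron_tagger", quiet=True)
nltk.download("averaged_perceptron_tagger_eng", quiet=True)

def pos_frequencies(comments):
    """Count verbs (VB*), adjectives (JJ*) and nouns (NN*) across comment strings."""
    counters = {"verb": Counter(), "adjective": Counter(), "noun": Counter()}
    for comment in comments:
        tokens = re.findall(r"[a-z]+", comment.lower())  # simple word tokenisation
        for word, tag in nltk.pos_tag(tokens):
            if tag.startswith("VB"):
                counters["verb"][word] += 1
            elif tag.startswith("JJ"):
                counters["adjective"][word] += 1
            elif tag.startswith("NN"):
                counters["noun"][word] += 1
    return counters

counts = pos_frequencies(["The dataset provides useful information for reuse."])
for pos, counter in counts.items():
    print(pos, counter.most_common(10))
```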
As the most frequently used verb, Use encompasses the dual
meaning of use and reuse. The former primarily refers to the
methods and terminology employed in the manuscript, while the latter pertains to the value derived from utilising the data. The verb Provide is more specific, generally taking the form of suggestions asking the author to add relevant information, detailed interpretations, data descriptions and other materials that are conducive to review and enhance the value of the manuscript. The high frequency of adjectives such as available and different once again demonstrates reviewers' concern for novelty, data quality and value assessment, with particular attention given to dataset
availability and usability. In addition, the nouns especially high-
light the dataset as the centrepiece of the data paper. Whether it
is deposited in an open, secure and credible repository, whether
it is adequately described, and whether the collection and anal-
ysis are handled properly are the key concerns of the reviewers'
assessment.
FIGURE 8 | Distribution of four types of attitudes' references in review reports.

5 | Discussion

Data serves as the fundamental cornerstone of research in many disciplines, as it is through the acquisition of original data and methodologies that the research can be replicated. Moreover, only by ensuring proper preservation and management of research data can long-term accessibility be guaranteed. Data papers play a crucial role in highlighting valuable resources and emphasising their potential for reuse (McGillivray et al. 2022), providing a comprehensive narrative description of scientific data from its inception during material preparation to its creation, release and the eventual reuse stage.
Peer review does not work in isolation but links to scientific
research, research data, publishing systems, and stakeholders.
The adoption of the open peer review model can augment the
accountability of peer reviewers, potentially mitigating reviewer
bias towards specific authors while incentivising researchers to
engage in peer review by increasing visibility for their contribu-
tions (Waltman etal.2023). Moreover, the peer review process
increasingly includes the review of scientific data, with the types
of data covered in journal scopes influencing data policies. This
aims to uphold the accuracy of conclusions drawn and the utility
of published datasets (Rousi and Laakso 2020). Therefore, we
chose to interpret and analyse journal guidelines and reviewer
reports, as these represent a wealth of practical insights and ex-
periences in data paper peer review.
Prior to data reuse, researchers need to establish trust based
on multiple dimensions, which requires more effort in preparing and managing data (Yoon 2017). A detailed and accurate description of the conditions under which data was generated is crucial for the reproducibility and reusability of research outputs (Suhr et al. 2020). This study reveals that peer review guidelines in data journals reflect a practical awareness by em-
phasising the quality of data and presentation to ensure the
credibility of data papers. The review framework is derived
from the integration of multiple journal guidelines (Figure4).
Currently, no journals have comprehensive coverage of all
stages of data publishing. Of these stages, data description and
data reuse have received particular attention, while data cre-
ation and data interpretation have been less well addressed. It
would be beneficial for journals to be coherent across all phases
of data publication, with a particular focus on dataset reviews
and paper- data alignment.
The key points developed by reviewers based on their experi-
ence and expertise in reviewing data papers are also a valu-
able resource for journals to utilise in refining their review
guidelines. To meet the expectations of journals and review-
ers, both the manuscript and the dataset itself must be com-
plete and useful. Reviewers evaluate the manuscript not only
for the novelty of the research topic and its significance to the
scientific community, but also for the structural integrity of
the data paper and the coherence of the data description and
dataset. With respect to datasets, the review is supposed to
address all phases of research data publication, including rel-
evant information support of data preparation, methodology
and quality control during data creation, interpretation and
conclusions from data analysis, format specification and read-
ability for data presentation, accessibility and secure storage
for data management, as well as assessment of potential for
data reuse.
FIGUR E  | Distribution of comments related to data(sets).
TABLE 5 | Word frequency of verbs, adjectives and nouns in comments related to data(sets).

Type | High frequency words
Verb | use (76), provide (68), have (67), present (36), describe (34), include (31), collect (26), make (24), show (17), propose (16)
Adjective | available (27), different (29), useful (23), important (21), possible (16), original (15), potential (14), clear (13), new (13), same (13)
Noun | dataset (246), data (173), database (53), description (51), paper (49), collection (29), information (26), section (20), research (20), quality (20)
6 | Conclusion
Our analysis of guidelines and review reports shows that there is a significant overlap between data journal guidelines and reviewer evaluations in terms of review dimensions such as data process and analysis and data reuse. This overlap reflects the guiding and restraining influence of journals on reviewers, while also indicating that the two groups have reached a certain consensus in reviewing data papers.
Through the bridging function of data papers, scientific data
transcends its role as mere output and becomes an accessible re-
source and reusable infrastructure for broader research endeav-
ours. Moreover, peer review plays a pivotal role in establishing
the credibility and transparency of published data papers, while
enhancing their quality to effectively emphasise the data sig-
nificance and reuse value. As the scientific community increas-
ingly embraces open science principles, the instrumental role
of data papers and open peer review in promoting data shar-
ing, resource reuse, and scientific collaboration will be further
demonstrated.
Author Contributions
Xinyu Wang prepared and analysed materials, and wrote the manu-
script. Lei Xu prepared the concept of the study and critically reviewed the
manuscript.
Conflicts of Interest
The authors declare no conflicts of interest.
Data Availability Statement
The data that support the findings of this study are available from the
corresponding author upon reasonable request.
References
Akers, K. 2014. "A Growing List of Data Journals." https://mlibrarydata.wordpress.com/2014/05/09/data-journals/.

Candela, L., D. Castelli, P. Manghi, and A. Tani. 2015. "Data Journals: A Survey." Journal of the Association for Information Science and Technology 66, no. 9: 1747–1762.

Chavan, V., and L. Penev. 2011. "The Data Paper: A Mechanism to Incentivize Data Publishing in Biodiversity Science." BMC Bioinformatics 12, no. 15: 1–12.

Costello, M. J., W. K. Michener, M. Gahegan, Z.-Q. Zhang, and P. E. Bourne. 2013. "Biodiversity Data Should Be Published, Cited, and Peer Reviewed." Trends in Ecology & Evolution 28, no. 8: 454–461.

Earth System Science Data. 2024. "Process of Peer Review and Publication." https://www.earth-system-science-data.net/peer_review/interactive_review_process.html.

Garcia-Costa, D., F. Squazzoni, B. Mehmani, and F. Grimaldo. 2022. "Measuring the Developmental Function of Peer Review: A Multi-Dimensional, Cross-Disciplinary Analysis of Review Reports From 740 Academic Journals." PeerJ 10: e13539.

García-García, A., A. López-Borrull, and F. Peset. 2015. "Data Journals: Eclosión de nuevas revistas especializadas en datos." El Profesional de la Información 24, no. 6: 845.

Jiao, C., and P. T. Darch. 2020. "The Role of the Data Paper in Scholarly Communication." Proceedings of the Association for Information Science and Technology 57, no. 1: e316. https://doi.org/10.1002/pra2.316.

Jiao, C., K. Li, and Z. Fang. 2023. "How Are Exclusively Data Journals Indexed in Major Scholarly Databases? An Examination of Four Databases." Scientific Data 10, no. 1: 737.

Jiao, H., Y. Qiu, X. Ma, and B. Yang. 2024. "Dissemination Effect of Data Papers on Scientific Datasets." Journal of the Association for Information Science and Technology 75, no. 2: 115–131. https://doi.org/10.1002/asi.24843.

Kim, J. 2020. "An Analysis of Data Paper Templates and Guidelines: Types of Contextual Information Described by Data Journals." Science Editing 7, no. 1: 16–23.

Kong, L., Y. Xi, Y. Lang, Y. Wang, and Q. Zhang. 2019. "A Data Quality Evaluation Index for Data Journals." In Big Scientific Data Management, edited by J. Li, X. Meng, Y. Zhang, W. Cui, and Z. Du, 291–300. Springer International Publishing.

Kotti, Z., K. Kravvaritis, K. Dritsa, and D. Spinellis. 2020. "Standing on Shoulders or Feet? An Extended Study on the Usage of the MSR Data Papers." Empirical Software Engineering 25, no. 5: 3288–3322.

Lawrence, B., C. Jones, B. Matthews, S. Pepler, and S. Callaghan. 2011. "Citation and Peer Review of Data: Moving Towards Formal Data Publication." International Journal of Digital Curation 6, no. 2: 4–37.

Li, K., and C. Jiao. 2022. "The Data Paper as a Sociolinguistic Epistemic Object: A Content Analysis on the Rhetorical Moves Used in Data Paper Abstracts." Journal of the Association for Information Science and Technology 73, no. 6: 834–846. https://doi.org/10.1002/asi.24585.

Loper, E., and S. Bird. 2002. "NLTK: The Natural Language Toolkit." ArXiv Preprint Cs/0205028.

McGillivray, B., P. Marongiu, N. Pedrazzini, M. Ribary, M. Wigdorowitz, and E. Zordan. 2022. "Deep Impact: A Study on the Impact of Data Papers and Datasets in the Humanities and Social Sciences." Publications 10, no. 4: 39.

Oxford University Press. 2023. "The Open Science Journal." https://academic.oup.com/gigascience/pages/About.

Penev, L., W. Chavan, T. Georgiev, and P. Stoev. 2012. "Data Papers as Incentives for Opening Biodiversity Data: One Year of Experience and Perspectives for the Future." https://pensoft.net/img/upl/file/DataPaperPoster.pdf.

Polka, J. K., R. Kiley, B. Konforti, B. Stern, and R. D. Vale. 2018. "Publish Peer Reviews." Nature 560, no. 7720: 545–547. https://doi.org/10.1038/d41586-018-06032-w.

Rees, J. 2010. "Recommendations for Independent Scholarly Publication of Data Sets." https://www.mumble.net/~jar/articles/data-publication.pdf.

Rousi, A. M., and M. Laakso. 2020. "Journal Research Data Sharing Policies: A Study of Highly-Cited Journals in Neuroscience, Physics, and Operations Research." Scientometrics 124: 131–152. https://doi.org/10.1007/s11192-020-03467-9.

Schöpfel, J., D. Farace, H. Prost, and A. Zane. 2020. "Data Papers as a New Form of Knowledge Organization in the Field of Research Data." Knowledge Organization 46, no. 8: 622–638.

Seo, S., and S. J. Kim. 2020. "Data Journals: Types of Peer Review, Review Criteria, and Editorial Committee Members' Positions." Science Editing 7, no. 2: 130–135.

Springer Nature. 2020. "Advancing Peer Review at BMC." https://www.biomedcentral.com/about/advancing-peer-review.

Squazzoni, F., P. Ahrweiler, T. Barros, et al. 2020. "Unlock Ways to Share Data on Peer Review." Nature 578: 512–514. https://doi.org/10.1038/d41586-020-00500-y.

Suhr, B., J. Dungl, and A. Stocker. 2020. "Search, Reuse and Sharing of Research Data in Materials Science and Engineering—A Qualitative Interview Study." PLoS One 15, no. 9: e0239216. https://doi.org/10.1371/journal.pone.0239216.

Thelwall, M. 2020. "Data in Brief: Can a Mega-Journal for Data Be Useful?" Scientometrics 124, no. 1: 697–709. https://doi.org/10.1007/s11192-020-03437-1.

Walters, W. H. 2020. "Data Journals: Incentivizing Data Access and Documentation Within the Scholarly Communication System." Insights 33, no. 1: 18.

Waltman, L., W. Kaltenbrunner, S. Pinfield, and H. B. Woods. 2023. "How to Improve Scientific Peer Review: Four Schools of Thought." Learned Publishing 36: 334–347. https://doi.org/10.1002/leap.1544.

Wei, C., J. Zhao, J. Ni, and J. Li. 2023. "What Does Open Peer Review Bring to Scientific Articles? Evidence From PLoS Journals." Scientometrics 128, no. 5: 2763–2776. https://doi.org/10.1007/s11192-023-04683-9.

Wiley. 2020. "Types of Peer Review." https://authorservices.wiley.com/Reviewers/journal-reviewers/what-is-peer-review/types-of-peer-review.html.

Wolfram, D., P. Wang, and F. Abuzahra. 2021. "An Exploration of Referees' Comments Published in Open Peer Review Journals: The Characteristics of Review Language and the Association Between Review Scrutiny and Citations." Research Evaluation 30, no. 3: 314–322.

Yardimci, G. G., H. Ozadam, M. E. G. Sauria, et al. 2019. "Measuring the Reproducibility and Quality of Hi-C Data." Genome Biology 20: 57.

Yoon, A. 2017. "Data Reusers' Trust Development." Journal of the Association for Information Science and Technology 68: 946–956. https://doi.org/10.1002/asi.23730.
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
Open data as an integral part of the open science movement enhances the openness and sharing of scientific datasets. Nevertheless, the normative utilization of data journals, data papers, scientific datasets, and data citations necessitates further research. This study aims to investigate the citation practices associated with data papers and to explore the role of data papers in disseminating scientific datasets. Dataset accession numbers from NCBI databases were employed to analyze the prevalence of data citations for data papers from PubMed Central. A dataset citation practice identification rule was subsequently established. The findings indicate a consistent growth in the number of biomedical data journals published in recent years, with data papers gaining attention and recognition as both publications and data sources. Although the use of data papers as citation sources for data remains relatively rare, there has been a steady increase in data paper citations for data utilization through formal data citations. Furthermore, the increasing proportion of datasets reported in data papers that are employed for analytical purposes highlights the distinct value of data papers in facilitating the dissemination and reuse of datasets to support novel research.
Article
Full-text available
The data paper is becoming a popular way for researchers to publish their research data. The growing numbers of data papers and journals hosting them have made them an important data source for understanding how research data is published and reused. One barrier to this research agenda is a lack of knowledge as to how data journals and their publications are indexed in the scholarly databases used for quantitative analysis. To address this gap, this study examines how a list of 18 exclusively data journals (i.e., journals that primarily accept data papers) are indexed in four popular scholarly databases: the Web of Science, Scopus, Dimensions, and OpenAlex. We investigate how comprehensively these databases cover the selected data journals and, in particular, how they present the document type information of data papers. We find that the coverage of data papers, as well as their document type information, is highly inconsistent across databases, which creates major challenges for future efforts to study them quantitatively, which should be addressed in the future.
Article
Full-text available
Peer review plays an essential role as one of the cornerstones of the scholarly publishing system. There are many initiatives that aim to improve the way in which peer review is organized, resulting in a highly complex landscape of innovation in peer review. Different initiatives are based on different views on the most urgent challenges faced by the peer review system, leading to a diversity of perspectives on how the system can be improved. To provide a more systematic understanding of the landscape of innovation in peer review, we suggest that the landscape is shaped by four schools of thought: The Quality & Reproducibility school, the Democracy & Transparency school, the Equity & Inclusion school, and the Efficiency & Incentives school. Each school has a different view on the key problems of the peer review system and the innovations necessary to address these problems. The schools partly complement each other, but we argue that there are also important tensions between them. We hope that the four schools of thought offer a useful framework to facilitate conversations about the future development of the peer review system.
Article
The humanities and social sciences (HSS) have recently witnessed an exponential growth in data-driven research. In response, attention has been afforded to datasets and accompanying data papers as outputs of the research and dissemination ecosystem. In 2015, two data journals dedicated to HSS disciplines appeared in this landscape: Journal of Open Humanities Data (JOHD) and Research Data Journal for the Humanities and Social Sciences (RDJ). In this paper, we analyse the state of the art in the landscape of data journals in HSS using JOHD and RDJ as exemplars by measuring performance and the deep impact of data-driven projects, including metrics (citation count, Altmetrics, views, downloads, tweets) of data papers in relation to associated research papers and the reuse of associated datasets. Our findings indicate: that data papers are published following the deposit of datasets in a repository and usually following research articles; that data papers have a positive impact both on the metrics of research papers associated with them and on data reuse; and that Twitter hashtags targeted at specific research campaigns can lead to increases in data papers' views and downloads. HSS data papers improve the visibility of datasets they describe, support accompanying research articles, and add to transparency and the open research agenda.
Article
Reviewers not only help editors to screen manuscripts for publication in academic journals; they also serve to increase the rigor and value of manuscripts through constructive feedback. However, measuring this developmental function of peer review is difficult, as it requires fine-grained data on reports and journals without any optimal benchmark. To fill this gap, we adapted a recently proposed quality assessment tool and tested it on a sample of 1.3 million reports submitted to 740 Elsevier journals in 2018–2020. Results showed that the developmental standards of peer review are shared across areas of research, yet with remarkable differences. Reports submitted to social science and economics journals show the highest developmental standards. Reports from junior reviewers, women, and reviewers from Western Europe are generally more developmental than those from senior reviewers, men, and reviewers working in academic institutions outside Western regions. Our findings suggest that increasing the standards of peer review at journals requires effort to assess interventions and measure practices with context-specific and multi-dimensional frameworks.
Article
The data paper is an emerging academic genre that focuses on the description of research data objects. However, there is a lack of empirical knowledge about this rising genre in quantitative science studies, particularly from the perspective of its linguistic features. To fill this gap, this research aims to offer a first quantitative examination of which rhetorical moves—rhetorical units performing a coherent narrative function—are used in data paper abstracts, as well as how these moves are used. To this end, we developed a new classification scheme for rhetorical moves in data paper abstracts by expanding a well‐received system that focuses on English‐language research article abstracts. We used this expanded scheme to classify and analyze rhetorical moves used in two flagship data journals, Scientific Data and Data in Brief. We found that data papers exhibit a combination of introduction, method, results, and discussion‐ and data‐oriented moves and that the usage differences between the journals can be largely explained by journal policies concerning abstract and paper structure. This research offers a novel examination of how the data paper, a data‐oriented knowledge representation, is composed, which greatly contributes to a deeper understanding of research data and its publication in the scholarly communication system.
Article
Open research data practices are a relatively new and still evolving part of scientific work, and their uptake varies strongly across scientific domains. The literature on open research data practices ranges from large empirical studies spanning multiple scientific domains to smaller, in-depth studies of a single field of research. Despite this rich literature, there is still a lack of knowledge about (open) research data awareness and practices in materials science and engineering. While most current studies focus only on selected aspects of open research data practices, we aim for a comprehensive understanding of these practices within the considered domain. This study therefore aims to (1) draw a complete picture of the search, reuse, and sharing of research data (2) with a focus on materials science and engineering. This approach makes it possible to explore the connections between different aspects of open research data practices, for example between data sharing and data search. In-depth interviews with 13 researchers in this field were conducted, transcribed verbatim, coded, and analysed using content analysis. The main findings characterise research data in materials science and engineering as extremely diverse, often generated for a very specific research focus, and requiring a precise description of the data and the complete generation process to enable reuse. Results on research data search and reuse showed that the interviewees intended to reuse data but were mostly unfamiliar with (yet interested in) modern methods such as dataset search engines, data journals, or searching public repositories. Current research data sharing is not open but bilateral, and is usually encouraged by supervisors or employers. Project funding affects data sharing in two ways: some researchers share their data openly because of their funding agency's policy, while others face legal restrictions because their projects are partly funded by industry. The time needed to describe the data and their generation process precisely is named as the biggest obstacle to data sharing. From these findings, a set of actions suitable to support open data is derived, involving training for researchers and introducing rewards for data sharing at the level of universities and funding bodies.
Article
This study examined the impact of open peer review (OPR) on the usage and citations of scientific articles using a dataset of 6441 articles published in six Public Library of Science (PLoS) journals in 2020–2021. We compared OPR articles with their non-OPR counterparts in the same journal to determine whether OPR increased the visibility and citations of the articles. Our results demonstrated a positive association between OPR and higher article page views, saves, shares, and a greater HTML-to-PDF conversion rate. However, we also found that OPR articles had a lower PDF-to-citations conversion rate compared to non-OPR articles. Furthermore, we investigated the effects of OPR on citations across various citation databases, including Web of Science, Scopus, Google Scholar, Semantic Scholar, and Dimensions. Our analysis indicated that OPR had a heterogeneous impact on citations across these databases. These findings provide compelling evidence for stakeholders, such as policymakers, publishers, and researchers, to participate in OPR and promote its adoption in scientific publishing. Additionally, our study underscores the importance of carefully selecting bibliographic databases when assessing the effect of OPR on article citations.
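For readers who want to see what such a group comparison can look like in practice, the following minimal Python sketch (hypothetical numbers, not the study's data or its actual analysis) compares citation counts of OPR and non-OPR articles with a Mann-Whitney U test, which does not assume normally distributed citation counts.

from scipy.stats import mannwhitneyu

# Hypothetical citation counts for articles from the same journal (illustration only).
opr_citations = [12, 5, 8, 20, 3, 7, 15, 9]
non_opr_citations = [10, 4, 6, 2, 11, 5, 8, 3]

# Two-sided test of whether the two groups differ in citation counts.
statistic, p_value = mannwhitneyu(opr_citations, non_opr_citations, alternative="two-sided")
print(f"U = {statistic:.1f}, p = {p_value:.3f}")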
Article
Journals that adopt open peer review (OPR), where review reports of published articles are publicly available, provide an opportunity to study both review content characteristics and quantitative aspects of the overall review process. This study investigates two areas relevant to the quality assessment of manuscript reviews. First, do journal policies requiring reviewers to identify themselves influence how reviewers evaluate the merits of a manuscript, as reflected in the relative frequency of hedging terms and research-related terms appearing in their reviews? Second, is there an association between the number of reviews/reviewers and the manuscript's research impact once published, as measured by citations? We selected reviews for articles published in 17 OPR journals from 2017 to 2018 to examine the incidence of reviewers' uses of hedging terms and research-related terms. The results suggest that there was little difference in the relative use of hedging terms regardless of whether reviewers were required to identify themselves or whether this was optional, indicating that the use of hedging in review contents was not influenced by journal requirements for reviewers to identify themselves. There was a larger difference observed for research-related terminology. We compared the total number of reviews for a manuscript, rounds of revisions, and the number of reviewers with the number of Web of Science citations the article received since publication. The findings reveal that scrutiny by more reviewers or conducting more reviews or rounds of review does not result in more impactful papers for most of the journals studied. Implications for peer review practice are discussed.
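To make the idea of measuring hedging in review texts concrete, the short Python sketch below (illustrative only; the hedging lexicon and the hedging_rate helper are assumptions for this example, not the term list or procedure used in the study above) computes the relative frequency of hedging terms per 1,000 tokens in a review report.

import re

# Small illustrative hedging lexicon; a real analysis would use a validated term list.
HEDGES = {"may", "might", "could", "perhaps", "possibly", "likely", "appears", "suggests"}

def hedging_rate(review_text):
    # Relative frequency of hedging terms per 1,000 word tokens.
    tokens = re.findall(r"[a-z]+", review_text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in HEDGES)
    return 1000 * hits / len(tokens)

print(hedging_rate("The method appears sound, but the sample size might be too small."))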
Article
Data sharing and reuse promise many benefits to science, but many researchers are reluctant to share and reuse data. Data papers, published as peer-reviewed articles that provide descriptive information about specific datasets, are a potential solution: they may incentivize sharing by providing a mechanism for data producers to get citation credit, and may support reuse by providing contextual information about dataset production. Data papers can receive many citations. However, does citation of a data paper mean reuse of the underlying dataset? This paper presents preliminary findings from a content-based citation analysis of data papers (n = 103) published in two specialized data journals, one in earth sciences and one in physical and chemical sciences. We conclude that while the genre of data papers facilitates some data sharing and reuse, it fails to live up to its full potential. Further, practices of reuse of datasets from data papers vary considerably between disciplines. We propose measures for academic publishers to enhance the data paper's role in scholarly communication, to attract more attention from researchers, and to inform discipline-specific policy and practices related to data publication.