Article

Data Papers as a New Form of Knowledge Organization in the Field of Research Data

IMR Press
Knowledge Organization

Abstract

Data papers have been defined as scholarly journal publications whose primary purpose is to describe research data. Our survey provides more insights about the environment of data papers, i.e., disciplines, publishers and business models, and about their structure, length, formats, metadata, and licensing. Data papers are a product of the emerging ecosystem of data-driven open science. They contribute to the FAIR principles for research data management. However, the boundaries with other categories of academic publishing are partly blurred. Data papers are (can be) generated automatically and are potentially machine-readable. Data papers are essentially information, i.e., description of data, but also partly contribute to the generation of knowledge and data on its own. Part of the new ecosystem of open and data-driven science, data papers and data journals are an interesting and relevant object for the assessment and understanding of the transition of the former system of academic publishing.


... They are intended to make research data easier to find, more available, and reusable. They are a key part of research data management and are interlinked with the repositories in which the data are published (Schöpfel et al. 2019; 2021). They thus contribute to the FAIR principles for scientific data management and stewardship (Wilkinson et al. 2016). ...
... Their main role is to announce the existence and accessibility of research data in online repositories. In doing so they make datasets findable and thereby encourage their reuse (Chavan et al. 2011; Schöpfel et al. 2019; Kim 2020). However, anyone can now publish data online without any control over their quality or a long-term management plan, both of which are essential to their scientific value and usefulness. ...
... It is therefore crucial that research data are published in a trustworthy repository, where they are assigned a persistent identifier (e.g. DOI, PID), and presented in a peer-reviewed data paper, which provides control over their quality (Chavan et al. 2011; Schöpfel et al. 2019). At the same time, data papers encourage the experts and scientists who invest a great deal of work in acquiring and curating data to publish them in appropriate repositories and to receive appropriate academic recognition for their work (Callaghan et al. 2012). ...
... Data papers serve as a bridge between data and traditional papers (Park et al., 2018), detailing the methods of creating and processing data, their structure and format, and their potential for reuse. Data papers are essential for ensuring the quality of published data and metadata (Gorgolewski et al., 2013), as well as the discovery, access, acquisition, understanding, reusing, long-term preservation, and management of scientific data (Chen, 2017;Kim, 2020;Schöpfel et al., 2020), particularly in cases of "little science" and "long tail" data (Heidorn, 2008;Kratz & Strasser, 2014). ...
... The results show that "Data in Brief" became the most important data journal in 2016. Schöpfel et al. (2020) established a representative sample of data journals. They used lists from the European FOSTER Plus project, the German wiki forschungsdaten.org, ...
... Callaghan et al. (2012) offer a straightforward definition, stating that a data paper addresses "information about what, where, why, how, and whose data" rather than presenting original findings. Schöpfel et al. (2020) propose a more comprehensive definition: a peer-reviewed, citable article in an academic journal whose main aim is to describe an existing research dataset to facilitate discovery, availability, and reusability. Researchers investigated the components and patterns of data papers by using a content analysis approach. ...
Article
Open data as an integral part of the open science movement enhances the openness and sharing of scientific datasets. Nevertheless, the normative utilization of data journals, data papers, scientific datasets, and data citations necessitates further research. This study aims to investigate the citation practices associated with data papers and to explore the role of data papers in disseminating scientific datasets. Dataset accession numbers from NCBI databases were employed to analyze the prevalence of data citations for data papers from PubMed Central. A dataset citation practice identification rule was subsequently established. The findings indicate a consistent growth in the number of biomedical data journals published in recent years, with data papers gaining attention and recognition as both publications and data sources. Although the use of data papers as citation sources for data remains relatively rare, there has been a steady increase in data paper citations for data utilization through formal data citations. Furthermore, the increasing proportion of datasets reported in data papers that are employed for analytical purposes highlights the distinct value of data papers in facilitating the dissemination and reuse of datasets to support novel research.
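The abstract above mentions using NCBI dataset accession numbers to detect data citations but does not spell out the identification rule. As a minimal sketch of what such matching could look like, the following assumes three well-known accession prefixes (GEO series `GSE`, BioProject `PRJNA`, SRA run `SRR`); the pattern set and the function name `find_accessions` are illustrative assumptions, not the study's actual rule.

```python
import re

# Assumed, illustrative subset of NCBI accession formats; the study's
# own identification rule is not given in the abstract.
ACCESSION_PATTERNS = {
    "GEO series": re.compile(r"\bGSE\d+\b"),
    "BioProject": re.compile(r"\bPRJNA\d+\b"),
    "SRA run": re.compile(r"\bSRR\d+\b"),
}


def find_accessions(text):
    """Return the accession numbers found in text, grouped by database label."""
    found = {}
    for label, pattern in ACCESSION_PATTERNS.items():
        matches = sorted(set(pattern.findall(text)))
        if matches:
            found[label] = matches
    return found


sentence = "Data were retrieved from GEO (GSE12345) and SRA run SRR000001."
print(find_accessions(sentence))
```

A real rule would also need to distinguish a dataset that is merely mentioned from one that is actually reused, which pattern matching alone cannot do.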
... The FAIR principles³ were published in 2016 with the goal of providing guidelines for the management of the research workflow, particularly with regard to data, respecting the criteria of findability, accessibility, interoperability, and reusability. Since 2018, cOAlition S has supported Plan S, an initiative which requires scientific publications resulting from publicly funded research to be published in "compliant Open Access journals or platforms"⁴. Among the institutions that have joined this initiative are UK Research and Innovation, The Research Council of Norway, and the National Science Centre in Poland. ...
... Callaghan et al. [3] define a data paper as an article that aims at describing a dataset that is available in open access, the context in which it was created, the process of creation behind it and its reuse potential (the 'what, where, why, how and who of the data'). Schöpfel et al. [4] give an overview of the discussion by presenting the different objectives outlined by journals that launched their data paper sections. They mention, for instance, the opportunity to obtain datasets of higher quality, offer other researchers and students the chance to access the data, open to new insights or research angles on the data. ...
... The first data journal, the Journal of Chemical and Engineering Data, was launched in 1956 [13], but the number of data journals has grown only in recent times. Schöpfel et al. [4], who updated the study performed by Candela et al. [14] on the number of data journals and the areas of interest covered by them, show that the number of data journals did not increase dramatically during those five years (from 20 data journals in 2015 to 28 in 2019, some of them no longer active), while the number of data papers published rose sharply from 846 in 2013 to 11,500 in 2019. Likewise, Walters [15] found that of the 169 journals that reported publishing research relating to data, only 19 (11.2%) were classified as "pure" data journals, in which at least half of the publications were data reports; 109 (64.5%) devoted some publications to data reports (about 1.6%) but prioritised other types of publications; 21 (12.4%) ...
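The shares quoted in the excerpt above can be checked with simple arithmetic. The sketch below recomputes them from the counts given (169 journals in total; groups of 19, 109, and 21) and also derives the growth factor implied by the rise from 846 to 11,500 data papers. The helper name `share` and the group labels are illustrative assumptions; the description of the third group is truncated in the excerpt and is left unnamed here.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


TOTAL_JOURNALS = 169  # journals reporting data-related publications (Walters)

# Counts as quoted in the excerpt; labels are shorthand, not the source's terms.
groups = {
    "pure data journals": 19,
    "some data reports": 109,
    "third group (truncated in the excerpt)": 21,
}

for label, count in groups.items():
    print(f"{label}: {count}/{TOTAL_JOURNALS} = {share(count, TOTAL_JOURNALS)}%")

# Growth in published data papers, 2013 to 2019, per Schöpfel et al.
growth_factor = 11_500 / 846
print(f"data papers grew by a factor of about {growth_factor:.1f}")
```

The recomputed percentages agree with the figures reported in the excerpt, and the data paper count grows roughly fourteenfold while the journal count rises only from 20 to 28.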
Article
The humanities and social sciences (HSS) have recently witnessed an exponential growth in data-driven research. In response, attention has been afforded to datasets and accompanying data papers as outputs of the research and dissemination ecosystem. In 2015, two data journals dedicated to HSS disciplines appeared in this landscape: Journal of Open Humanities Data (JOHD) and Research Data Journal for the Humanities and Social Sciences (RDJ). In this paper, we analyse the state of the art in the landscape of data journals in HSS using JOHD and RDJ as exemplars by measuring performance and the deep impact of data-driven projects, including metrics (citation count; Altmetrics, views, downloads, tweets) of data papers in relation to associated research papers and the reuse of associated datasets. Our findings indicate: that data papers are published following the deposit of datasets in a repository and usually following research articles; that data papers have a positive impact on both the metrics of research papers associated with them and on data reuse; and that Twitter hashtags targeted at specific research campaigns can lead to increases in data papers’ views and downloads. HSS data papers improve the visibility of datasets they describe, support accompanying research articles, and add to transparency and the open research agenda.
... How? By following some kind of dissemination procedure like the one proposed in [20] in order to identify correctly the RD set of files, to set a title and the list of persons in the producer team (that can be completed with their different roles), to determine the important versions and associated dates, to write the associated documentation, to verify the legal [19,32] (and ethical) context of the RD, including issues like data security and privacy, and to give the license settling the sharing conditions, etc., which can include the publication of a data paper [35,36]. In order to increase the return on public investments in scientific research, RD dissemination should respect principles and follow guidelines as described in [2,34,37]. ...
... How? An important similarity between RS and RD is that, currently, they do not have a publication procedure as widely accepted as the one existing for articles published in scientific journals (see [20]), despite the fact that data papers and software papers [35,36] are becoming increasingly popular and there are more and more suggestions about where to publish these kinds of papers (see, for example, the Software Sustainability Institute list of Journals in which it is possible to publish software at https://www.software.ac.uk/top-tip/which-journals-should-i-publish-my-software (accessed on 11 November 2024) or the CIRAD (French Agricultural Research Centre for International Development) list of Scientific Journals and book editors at https://ou-publier.cirad.fr/revues? ...
Article
In the context of Open Science, the importance of Borgman’s conundrum challenges, initially formulated to capture the difficulties of sharing Research Data, is well known: which Research Data might be shared, by whom, with whom, under what conditions, why, and to what effects. We have recently reviewed the concepts of Research Software and Research Data, concluding with new formulations for their definitions, and proposing answers to these conundrum challenges for Research Data. In the present work we extend the consideration of Borgman’s conundrum challenges to Research Software, providing answers to these questions in this new context. Moreover, we complete the initial list of questions and answers by asking how and where Research Software may be shared. Our approach begins by recalling the main issues involved in the definition of Research Software, and its production context in the research environment, from the Open Science perspective. Then we address the conundrum challenges for Research Software by exploring the potential similarities and differences with respect to our answers to these questions in the case of Research Data. We conclude by emphasizing the usefulness of the methodology followed, which exploits the parallelism between Research Software and Research Data in the Open Science environment.
... Another significant recent development concerning research data is the academic genre of the data paper, which gradually took shape in the early 2010s as a "scholarly publication of a searchable metadata document describing a particular online accessible dataset, or a group of datasets, published in accordance to the standard academic practices" (Chavan & Penev, 2011, p. 3). It serves as a descriptor and citable proxy of data objects in the bibliographic universe, so that research data can be more findable, citable, and reusable under the current research infrastructure (Gorgolewski et al., 2013;Li et al., 2020;Li & Jiao, 2021), goals that are consistent with the FAIR principles (Groth et al., 2020;Schöpfel et al., 2020). Moreover, data papers are making it easier for research data to be peer-reviewed, a significant prerequisite for the integration of data objects into the research system (Costello et al., 2013;Mayernik et al., 2015). ...
... Candela and colleagues' survey (2015) identified more than 100 data journals publishing data papers by the year 2015. Yet the landscape is fluid: new journals are being established every year, and many of these data journals have ceased to publish as well (Schöpfel et al., 2020;Stuart, 2017). ...
Preprint
As part of the data-driven paradigm and open science movement, the data paper is becoming a popular way for researchers to publish their research data, based on academic norms that cross knowledge domains. Data journals have also been created to host this new academic genre. The growing number of data papers and journals has made them an important large-scale data source for understanding how research data is published and reused in our research system. One barrier to this research agenda is a lack of knowledge as to how data journals and their publications are indexed in the scholarly databases used for quantitative analysis. To address this gap, this study examines how a list of 18 exclusively data journals (i.e., journals that primarily accept data papers) are indexed in four popular scholarly databases: the Web of Science, Scopus, Dimensions, and OpenAlex. We investigate how comprehensively these databases cover the selected data journals and, in particular, how they present the document type information of data papers. We find that the coverage of data papers, as well as their document type information, is highly inconsistent across databases, which creates major challenges for future efforts to study them quantitatively. As a result, we argue that efforts should be made by data journals and databases to improve the quality of metadata for this emerging genre.
... Nevertheless, these publications significantly contribute to the assessment of data value, bringing with them stringent requirements for metadata standardisation and specialisation to complement research papers. Moreover, the genre of data papers does foster data sharing and reuse of valuable datasets (Jiao and Darch 2020;Schöpfel et al. 2020). They enable a division of labour in which resource-endowed entities can perform experiments and surveys to create useful datasets, so that others with the capacity to reproduce the data may reuse them when appropriate (Rees 2010). ...
Article
The evolution of data journals and the increase in data papers call for associated peer review, which is intricately linked yet distinct from traditional scientific paper review. This study investigates the data paper review guidelines of 22 scholarly journals that publish data papers and analyses 131 data papers' review reports from the journal Data. Peer review is an essential part of scholarly publishing. Although the 22 data journals employ disparate review models, their review purposes and requirements exhibit similarities. Journal guidelines provide authors and reviewers with comprehensive references for reviewing, which cover the entire life cycle of data. Reviewer attitudes predominantly encompass Suggestion, Inquiry, Criticism and Compliment during the specific review process, focusing on 18 key targets including manuscript writing, diagram presentation, data process and analysis, references and review and so forth. In addition, objective statements and other general opinions are also identified. The findings show the distinctive characteristics of data publication assessment and summarise the main concerns of journals and reviewers regarding the evaluation of data papers.
... Given its strong relevance to the data-driven research paradigm, this new academic genre has gained traction in recent years, with the number of published data papers surpassing ten thousand and continuing to grow. Some universities and researchers now consider data paper submission a standard practice in data publication (Schöpfel et al., 2020; Thelwall, 2020). ...
Article
Introduction. Research data sharing and reuse have become increasingly important in modern science, and data papers represent a new academic publication genre aimed at enhancing the visibility, sharing, and reuse of research data. However, whether citations to data papers reflect actual data reuse remains largely unexplored. This paper presents preliminary findings from a project designed to address this gap. Method. We conducted a content analysis to manually annotate 437 citation sentences from 309 research articles referencing 50 data papers published in Data in Brief, a leading academic journal that publishes only data papers. The data papers were sampled from five knowledge domains based on a paper-level classification system. Results. Our results show that most citations to the selected data papers (89%) are unrelated to the research data described in the paper, instead focusing on the research findings or methodologies. This suggests that data papers are being cited similarly to traditional research articles, despite their unique purpose and content. Conclusion. These findings raise questions about the effectiveness of data papers as representations of research data within the scholarly communication system, as well as their utility in quantitative studies on data reuse.
... A leading archaeological and historical database, SESHAT (Turchin et al. 2019), is structured by polities and regions, and cities are not part of the data. Given the shoddy standards identified for some of the new interdisciplinary data journals (Schöpfel et al. 2020), it is not surprising that urban history and archaeology already have a case of demonstrably bad data being promoted and used by scholars in other fields: the Chandler city-size data. Chandler's (1987) compilations of city-size data in antiquity and history are full of errors and problems (see M.E. ...
Article
The archaeology of early urbanism is a growing and dynamic field of research, which has benefited in recent years from numerous advances at both a theoretical and a methodological level. Scholars are increasingly acknowledging that premodern urbanization was a much more diverse phenomenon than traditionally thought, with alternative forms of urbanism now identified in numerous parts of the world. In this article, we review recent developments, focusing on the following main themes: (a) what cities are (including questions of definitions); (b) what cities do (with an emphasis on the concentration of people, institutions, and activities in space); (c) methodological advances (from LiDAR to bioarchaeology); (d) the rise and fall of cities (through a focus on persistence); and (e) challenges and opportunities for urban archaeology moving forward. Our approach places people—with their activities and networks—at the center of analysis, as epitomized by the quotation from Shakespeare used as the subtitle of our article.
... It is officially defined as a "scholarly publication of a searchable metadata document describing a particular online accessible dataset, or a group of datasets, published in accordance to the standard academic practices" [6]. Serving as a descriptor and citable proxy of data objects in the bibliographic universe, it can make research data more findable, citable, and reusable under the current research infrastructure [7–9], goals that are consistent with the FAIR principles [10,11]. Moreover, data papers are making it easier for research data to be peer-reviewed, a significant prerequisite for the integration of data objects into the research system [12,13]. ...
Article
The data paper is becoming a popular way for researchers to publish their research data. The growing numbers of data papers and of journals hosting them have made them an important data source for understanding how research data is published and reused. One barrier to this research agenda is a lack of knowledge as to how data journals and their publications are indexed in the scholarly databases used for quantitative analysis. To address this gap, this study examines how a list of 18 exclusively data journals (i.e., journals that primarily accept data papers) are indexed in four popular scholarly databases: the Web of Science, Scopus, Dimensions, and OpenAlex. We investigate how comprehensively these databases cover the selected data journals and, in particular, how they present the document type information of data papers. We find that the coverage of data papers, as well as their document type information, is highly inconsistent across databases, which creates major challenges for future efforts to study them quantitatively and should be addressed in the future.
... Journal word limits often mean these details get squeezed out of articles during the editing process. One promising solution is to produce a supplementary "data paper" (Schöpfel et al., 2019), which describes data in much more detail than a traditional manuscript. For instance, in a qualitative data paper, a researcher could provide detailed context about the community and historical context in which data were collected, how the data were collected, and any additional information about positionality and reflexivity that other researchers would need to know before using the data. ...
Article
Discussions around transparency in open science focus primarily on sharing data, materials, and coding schemes, especially as these practices relate to reproducibility. This fairly quantitative perspective of transparency does not align with all scientific methodologies. Indeed, qualitative researchers also care deeply about how knowledge is produced, what factors influence the research process, and how to share this information. Explicating a researcher’s background and role allows researchers to consider their impact on the research process and interpretation of the data, thereby increasing both transparency and rigor. Researchers may engage in positionality and reflexivity in a variety of ways, and transparently sharing these steps allows readers to draw their own informed conclusions about the results and study as a whole. Imposing a limited, quantitatively-informed set of standards on all research can cause harm to researchers and the communities they work with if researchers are not careful in considering the impact of such standards. Our paper will argue the importance of avoiding strong defaults around transparency (e.g., always share data) and build upon previous work around qualitative open science. We explore how transparency in all aspects of our research can lend itself toward projecting and confirming the rigor of our work.
... This type of article therefore does not have a classic textual structure and organization; these "are not articles about research, but about data", and they "do not bring new knowledge, but serve to generate knowledge".³¹ In this case the authors need not, do not wish to, or cannot write and publish a classic scientific article analysing results, or they wish to present more extensive data whose volume does not allow them to be incorporated into a classic scientific article.³² Most data journals belong to the natural and biomedical sciences and are published in open access. ...
... There are some non-traditional ways to share data that allow for researchers to provide thick descriptions of research context that traditional papers often lack due to page/word limits. One promising means is the data paper (Schöpfel et al., 2019), which allows researchers to describe their data in much more detail than a traditional manuscript. For instance, in a qualitative data paper a researcher could provide detailed context about the community and historical context in which data was collected, how the data was collected, and any additional information about positionality and reflexivity that other researchers would need to know before using the data. ...
... Figure 10 presents the number of scientific papers from the sample, distributed by the number of authors. Overall, the sample shows one paper with eleven authors [15], one with ten authors [16], three with nine authors [17–19], two with eight authors [8,20], one with seven authors [21], one with six authors [3], seven with four authors [7,9,22–26], two with three authors [4,27], seven with two authors [10,28–33], and seven with one author (referring to the following authors: Cameron Neylon [1]; Bohyun Kim [34]; William H Walters [35]; Matthew I. Bellgard [36]; Kai Nishikawa [37]; Ayla Stein Kenfield [38]; and David Wilcox [39]). ...
Article
The need, perceived since the second decade of the 21st century, to improve the infrastructure supporting the reuse of scholarly data led to the design of a concise set of principles and metrics, named the FAIR Data Principles. This paper, part of an extended study, intends to identify the main authors, entities, and scientific journals linked to research conducted within the FAIR Data Principles. The research was developed by means of a qualitative approach, using documentary research and a constant comparison method for codification and categorization of the sampled data. The sample studied showed that most authors were located in the Netherlands, with Europe accounting for more than 70% of the authors considered. Most of these are researchers and work in higher education institutions. These entities can be found in most of the territorial-administrative areas under consideration, with the USA being the country with the most entities and Europe the world region where they are most numerous. The journal with the most texts in the sample was Insights, and 2020 was the year in which the most texts were published. Two of the most prominent authors in the sample texts were located in the Netherlands, while the other two were in France and Australia.
... Data papers are authored, peer-reviewed articles cited in academic journals, whose main content is a description of research datasets together with basic information on the production and acquisition of the data, with the aim of facilitating their accessibility, availability, and reuse. They are integrated into research data management and linked to data repositories (SCHÖPFEL et al., 2019). ...
Article
Introduction: The structuring of biodiversity datasets is being disseminated in a form reserved for describing the substrate of scientific communication, called data papers, that is, the data that underpin scientific research in this field of knowledge, independently of the traditional model of scientific communication. Objective: To analyse publications in data paper format in the field of biodiversity at the international level. Methodology: Documentary research with a qualitative approach, applying techniques for collecting and examining information by means of content analysis. It examines the status of 33 journals listed by the Global Biodiversity Information Facility (GBIF) that offer publications in data paper format. It identifies: topics related to biodiversity; the types of licences and indexers; the number of data papers published; the titles that are open or closed access; the journals that publish the most data papers on biodiversity; and the language in which they were published. Results: The number of data papers grew exponentially between 2017 and May 2022; articles in the field of biodiversity have likewise increased across various topics covering its whole ecosystem. Conclusion: The data papers analysed are characterized as peer-reviewed documents and represent datasets indexed with metadata standards suitable for digitally preserving the data recorded in the journals covered by this analysis.
... Interpretation: Unlike traditional journal research papers that strictly distinguish between data and discussion/analysis, data papers obscure such distinctions. Thus, there is controversy over whether the role of data papers is to supplement or replace research papers [13]. For now, the value of data papers has been shown to be for promoting data sharing and reuse rather than for obtaining academic recognition, which is complementary to existing research papers. ...
Article
Purpose: This study investigated the usefulness and limitations of data journals by analyzing motivations for submission, review and publication processes according to researchers with experience publishing in data journals. Methods: Among 79 data journals indexed in Web of Science, we selected four data journals where data papers accounted for more than 20% of the publication volume and whose corresponding authors belonged to South Korean research institutes. A qualitative analysis was conducted of the subjective experiences of seven corresponding authors who agreed to participate in interviews. To analyze interview transcriptions, clusters were created by restructuring the theme nodes using NVivo 12. Results: The most important element of data journals to researchers was their usefulness for obtaining credit for research performance. Since the data in repositories linked to data papers are screened using journals’ review processes, the validity, accuracy, reusability, and reliability of data are ensured. In addition, data journals provide a basis for data sharing using repositories and data-centered follow-up research using citations and offer detailed descriptions of data. Conclusion: Data journals play a leading role in data-centered research. Data papers are recognized as research achievements through citations in the same way as research papers published in conventional journals, but there was also a perception that it is difficult to attain a similar level of academic recognition with data papers as with research papers. However, researchers highly valued the usefulness of data journals, and data journals should thus be developed into new academic communication channels that enhance data sharing and reuse.
... Quality control of data papers (some kind of peer review) always implies an evaluation of the datasets themselves and their respective repositories. But for the moment, this new way of academic publishing represents a very small and marginal part of the overall research output (Schöpfel et al., 2019). ...
Chapter
The outbreak of the COVID-19 pandemic has boosted the need for seamless, unrestricted, fast, and free access to the latest research results on the virus, on its treatment, prevention, protocols, and so on. Open access to publications and research data suddenly became self-evident, not only for researchers in life and medical sciences but also for politicians, journalists, and society as a whole. At the same time, this sudden awareness triggered another debate on the quality and, moreover, the trustworthiness of this mass of information made available most often without any form of quality control (peer review). Thousands of datasets from research on COVID-19 and related topics have already been deposited on data repositories. Our chapter discusses the issue of the quality and trustworthiness of research data in data repositories using examples from the ongoing pandemic. It offers insights into some fundamental concepts and summarizes recommendations for quality assurance and evaluation of research data.
... This streamlined metadata conversion workflow was first introduced in several of Pensoft's biodiversity journals and then in journals by other publishers, such as Nature's Scientific Data, PLoS One, BMC Ecology, and many others [64]. Since 2011, nearly 300 data papers have been published in Pensoft's journals and there is a steady uptake of this type of publication not only among Pensoft's journals but among journals of other publishers too [65]. Data papers are no longer an abstract idea but have already been practically implemented in multiple journals in different disciplines. ...
Article
Background: Data papers have emerged as a powerful instrument for open data publishing, obtaining credit, and establishing priority for datasets generated in scientific experiments. Academic publishing improves data and metadata quality through peer review and increases the impact of datasets by enhancing their visibility, accessibility, and reusability. Objective: We aimed to establish a new type of article structure and template for omics studies: the omics data paper. To improve data interoperability and further incentivize researchers to publish well-described datasets, we created a prototype workflow for streamlined import of genomics metadata from the European Nucleotide Archive directly into a data paper manuscript. Methods: An omics data paper template was designed by defining key article sections that encourage the description of omics datasets and methodologies. A metadata import workflow, based on REpresentational State Transfer services and XPath, was prototyped to extract information from the European Nucleotide Archive, ArrayExpress, and BioSamples databases. Findings: The template and workflow for automatic import of standard-compliant metadata into an omics data paper manuscript provide a mechanism for enhancing existing metadata through publishing. Conclusion: The omics data paper structure and workflow for import of genomics metadata will help to bring genomic and other omics datasets into the spotlight. Promoting enhanced metadata descriptions and enforcing manuscript peer review and data auditing of the underlying datasets brings additional quality to datasets. We hope that streamlined metadata reuse for scholarly publishing encourages authors to create enhanced metadata descriptions in the form of data papers to improve both the quality of their metadata and its findability and accessibility.
Conference Paper
Abstract: The publication of research data in data papers has been growing as a form of sharing, raising the status of research data to that of a legitimate scientific publication that can be indexed by databases. In Brazil, although data papers have only a modest presence in national journals, researchers have been adopting them. The aim of this work is to identify how research data are being published by authors affiliated with the Universidade Federal de Santa Catarina. The specific objectives are: a) to identify authorship by institutional affiliation, country, and funding source; b) to characterize the journals in which the data papers were published; and c) to analyze the research data repositories. An exploratory and descriptive study was carried out using bibliographic and documentary research. The results show that the data papers were published by co-authors affiliated predominantly with international higher education institutions and received funding from national and international public agencies. Most of the data journals are in the areas of Technology, Life Sciences, and Biomedicine, and are published by commercial publishers concentrated in Europe. The repositories identified are generalist and maintained by partnerships between universities and research centers, and their free-of-charge use is limited. Consequently, the consolidation of data paper publication and data deposit still lacks standardization and guidance on depositing in data repositories. In addition, attention must be paid to the entry of commercial publishers into the publication of research data.
Article
This article sets out the issues and describes the steps involved in writing a data paper in the humanities and social sciences (SHS), more specifically in the sciences of music education and training (SEFM). These writing and publication practices are recent in general, and new to the authors of this contribution. By sharing this experience, we wish to take part in the international movement to promote research data within the framework of open science, by providing help to researchers, particularly in disciplines where this practice is still very uncommon. The first part summarizes the main purposes, characteristics, and structures of a data paper in SHS. The second part sets out, step by step, the methodological requirements that precede the writing of a data paper. The third part is a brief discussion of the difficulties encountered while writing a data paper within the MusiPim project (Musique et Partenariat Inter-Métiers), which brings together corpora documenting situations of music teaching and learning in an orchestra.
Article
Introduction: Presents definitions, in light of the Open Science movement, of the practice of publishing research data. Objectives: Understand the terminological and conceptual universe, as well as its application in the context of Open Science. Methodology: Database literature review on the practice of publishing research data. Results: Scientific communication via publication of research data tends to increase and will be presented in different formats. It can be terminologically understood in different ways; however, it will always be in tune with the behavior and characteristics of each area of knowledge. Conclusion: This behavior demonstrates the youthfulness of the practice and its absorption in the areas and by the actors that make up the scientific communication scenario, providing opportunities for studies of areas in accordance with their information and data standards.
Article
Through a systemic analysis of scientific publishing in France, and more particularly through a study of its historical, socio-economic, and political contexts, this article raises the ethical issues that have marked the construction of the scientific ecosystem and the exploitation of research outputs. It also reflects on the ethical challenges linked to the implementation of the Open Science policy since 2018, as well as on the direct consequences this will have for researchers' practices.
Article
Data papers, a new class of scholarly publication emerging from the open‐science movement, foster data discovery and reuse by offering comprehensive descriptions of research data. Yet, despite their promising growth, the role of data papers in scholarly communication remains underexplored. This work therefore investigates the perceived contributions and functions of data papers to scholarly communication by interviewing 14 data‐paper authors operating in the field of natural science. Using conceptual frameworks adopted from Borgman (2007) and Van de Sompel et al. (2004), we identify four general functions of scholarly communication (i.e., legitimization; dissemination; access, preservation, and curation; and rewarding). Additionally, our data lead us to propose that verification is a distinct scholarly communication function, underscoring the importance of data papers in validating research findings in the context of ensuring research transparency. By elucidating the crucial role that data papers now play within the scholarly communication ecosystem, this study seeks to raise the academic community's awareness of their fundamental position, as well as their co‐existence with other forms of data publication, in advancing scientific research.
Article
Data papers, as one of the channels to encourage researchers to open up research data under the open science movement, are expected to provide strong incentives through formal citations. However, few studies have investigated the drivers of this emerging type of publication. This study examines researchers' motivations and considerations for data paper submission, as well as their perspectives on this type of scholarly publication. Through in‐depth interviews with ten data paper authors, our preliminary results show that researchers are often driven by extrinsic factors to increase their publications, and that data papers are sometimes viewed as territory claims before further research. Although the academic community widely recognizes the benefits of publishing data papers, some still cast a doubtful eye on their academic value and impact. We anticipate that such insights into the driving forces behind data papers and the points of view on them could provide opportunities for stakeholders to fill gaps and strengthen the open science ecosystem.
Chapter
The landscape of data repositories is very varied and heterogeneous. The issue of trust is at the heart of the development of research data repositories – trust in both the content and the quality of the facility. In computing, a repository is a centralized and organized store of data. It can be one or more databases where files are located for distribution over the network or a place directly accessible to users. The cost of this equipment has led research communities to collaborate around data collection and thus to set up a system of data standardization and dissemination. Data repositories contribute to the mechanism of research data publishing. This chapter proposes considering data repositories as digital devices, that is, as tools for mediating scientific information between producers and users. The actual use of a new device or technology is affected by several factors, including perceived usefulness, ease of use, quality of services and results.
Article
This article proposes a 4-step model for scientific dissemination that aims to promote evidence-based professional practice in Operations Management or Human Resource Management as well as research with a more transparent and reproducible process. These 4 steps include: (1) social network announcements, (2) dissemination to scientific journals, (3) dissemination to social networks, and (4) scientific dissemination to professional journals. Central to the 4-step model is a three-stage publication process within the second step, which adds an additional stage to the two previously proposed (Marin-Garcia, 2015). These three publication stages begin with a protocol paper, are followed by a data paper, and finish with a traditional article. Each stage promotes research with merit which is citable and recognizable as such before the scientific evaluation bodies. As two of these stages are largely unknown within the fields of Business and Management, I define the details of a protocol paper and a data paper including their contents. In addition, I provide examples of both papers as well as the other steps of the science dissemination model. This model can be adopted by researchers as a means of achieving greater impact and transfer of research results. This work intends to help researchers to understand, to evaluate, and to make better decisions about how their research reaches society at large outside of academia. In this way, WPOM aligns with the recommendations of several leading journals in the field of business management on the need to promote transparent, accessible, and replicable science (Beugelsdijk et al., 2020). WPOM goes one step further in compliance with this direction by having relevant journals that not only accept, but also actively encourage the publication of protocol papers and data papers.
WPOM strives to pioneer in this field of Business and Management. This article also explores the potential prevalence of protocol papers and data papers within the set of all articles published in journals indexed in Clarivate Web of Science and Scopus. With this editorial, WPOM is committed to promoting this model by accepting for review any of the three types of scientific contributions including protocol papers, data papers, and traditional papers.
Article
Achieving the potential of widespread sharing of open research data requires that sharing data is straightforward, supported, and well‐understood; and that data is discoverable by researchers. Our literature review and environment scan suggest that while substantial effort is dedicated to structured descriptions of research data, unstructured fields are commonly available (title, description) yet poorly understood. There is no clear description of what information should be included, in what level of detail, and in what order. These human‐readable fields, routinely used in indexing and search features and reliably federated, are essential to the research data user experience. We propose a set of high‐level best practices for unstructured description of datasets, to serve as the essential starting point for more granular, discipline‐specific guidance. We based these practices on extensive review of literature on research article abstracts; archival practice; experience in supporting research data management; and grey literature on data documentation. They were iteratively refined based on comments received in a webinar series with researchers, data curators, data repository managers, and librarians in Canada. We demonstrate the need for information research to more closely examine these unstructured fields and provide a foundation for a more detailed conversation.
Article
Article available for download at: https://www.inshs.cnrs.fr/sites/institut_inshs/files/download-file/lettre_infoINSHS_63v3_0.pdf
Article
Open research data practices are a relatively new, and thus still evolving, part of scientific work, and their usage varies strongly across scientific domains. In the literature, the investigation of open research data practices ranges from large empirical studies covering multiple scientific domains to smaller, in-depth studies analysing a single field of research. Despite the richness of literature on this topic, there is still a lack of knowledge on (open) research data awareness and practices in materials science and engineering. While most current studies focus only on some aspects of open research data practices, we aim for a comprehensive understanding of all practices with respect to the considered scientific domain. Hence this study aims at 1) drawing the whole picture of search, reuse and sharing of research data 2) while focusing on materials science and engineering. The chosen approach makes it possible to explore the connections between different aspects of open research data practices, e.g. between data sharing and data search. In-depth interviews with 13 researchers in this field were conducted, transcribed verbatim, coded and analysed using content analysis. The main findings characterised research data in materials science and engineering as extremely diverse, often generated for a very specific research focus and needing a precise description of the data and the complete generation process for possible reuse. Results on research data search and reuse showed that the interviewees intended to reuse data but were mostly unfamiliar with (yet interested in) modern methods such as dataset search engines, data journals or searching public repositories. Current research data sharing is not open but bilateral, and is usually encouraged by supervisors or employers.
Project funding affects data sharing in two ways: some researchers say they share their data openly because of their funding agency’s policy, while others face legal restrictions on sharing because their projects are partly funded by industry. The time needed for a precise description of the data and their generation process is named as the biggest obstacle to data sharing. From these findings, a precise set of actions suitable for supporting Open Data is derived, involving training for researchers and introducing rewards for data sharing at the level of universities and funding bodies.
Article
Open science has become a top priority of the French state's research policy. Among the strands of this policy is the opening of research data and scientific publications. Starting from the traditional functions of scientific journals (quality, intellectual property, dissemination, preservation), the article examines the impact of open science policy on these journals, in particular in three areas: their role in the operation of evaluation and monitoring systems (political function), their role in the operation of data systems, in particular for the acquisition and exploitation of big data by the information industry (Big Data function), and their role in the economic operation of publishers, aggregators, agencies, etc. (economic function). Insofar as access to journals becomes free, the question arises whether the use of journals is becoming the real product of the platforms, through the information generated.