Article
Proceedings ICSTRDA
Copyright (c) 2021: Advanced Research Publications
INFO ABSTRACT
Although a massive amount of research data is generated through modern research activities, lack of awareness regarding documentation, metadata rendering, versioning, file format selection, data cleaning, secure storage options, data deposit templates and making research data easily accessible through research data repositories has resulted in invaluable research data being lost or rejected. Proper management and sharing of research data increase the access, impact and efficiency of research activities. Therefore, as required by government funding agencies' guidelines, academic and research institutions have started to establish research data repository platforms. However, building effective research data repositories requires appropriate data curation activities before data is uploaded and published. The current study has been undertaken with the primary objective of presenting the best data curation activities in research data repositories. The study further gives an overview of the software tools and applications available for various data curation activities, viz. data cleaning, metadata creation, editing images/videos, storing data, identifying and validating data files, applications for data curation, and data indexing for searches, among other core activities. The author further identifies the roles of interdisciplinary librarians, data generators and data providers in performing the best data curation activities in research data repositories. The methodology of the study was guided by content analysis of literature on data curation activities and the role of interdisciplinary librarians in data repositories. Additionally, data about the software and application tools available for data curation was collected through web surveys. The outcome of this study will benefit the key stakeholders in adopting best practices for data curation in research data repositories to enable research data sharing and reuse. It will also help in developing the skills and competencies required to fulfil the role of interdisciplinary librarians in data curation activities.
Keywords: Data Curation, Data Cleaning, Research Data Repositories,
Data Librarian, Data Generator
Corresponding Author:
Manu TR, Adani Institute of Infrastructure,
Ahmedabad, Gujarat, India.
E-mail Id:
manutr91@gmail.com
How to cite this article:
Manu TR, Gala B. Data Curation Activities in
Research Data Repositories: Best Practices.
Proceedings ICSTRDA 2021; 43-51.
Date of Submission: 2021-02-09
Date of Acceptance: 2021-03-01
Data Curation Activities in Research Data
Repositories: Best Practices
Manu TR1, Bhakti Gala2
1Library, Adani Institute of Infrastructure, Ahmedabad, Gujarat, India.
2School of Library and Information Science, Central University of Gujarat, Gandhinagar, India.
Virtual International Conference on
Statistical Tools and Techniques for Research Data Analysis ICSTRDA 2021
21 & 22 January 2021
Introduction
Research data curation is a process that considers data needs for current and prospective use, focusing on consultation and solutions for improved access, data protection, citation and documentation. Data curation is the “active and on-going management of data through its lifecycle of interest and usefulness to scholarship, science and education” through activities that “enable data discovery and retrieval, maintain quality, add value and provide for re-use over time” (Library, 2021). The best data curation activities are required for enabling research data sharing and reuse through research data repositories. The University of California library states that the process of research data curation includes metadata and documentation, file and folder organization, storage and preservation, version control, the Carpentries, data dictionaries, etc. (University of California, 2021). Frequently used terms in research data curation include archive, preservation, back-up, file formats, file sharing, Creative Commons, agreement, license, metadata, data repositories, non-proprietary file format, persistent identifiers and standards. The eight major data curation steps are receiving, appraising and selecting, processing, ingesting and storing, describing (with appropriate metadata), facilitating access, preserving and reusing research data.
Librarians are increasingly expected to play a role in data curation, where they can assist researchers by maintaining and adding value to research data for current and future use. A librarian can take measures to ensure that data is documented, maintained and accessed through the proper channel. Nowadays librarians are also called data curators, digital curators, data analysts, data managers, metadata creators, etc., according to their roles and responsibilities within data curation practices.
Since data curation is an area of interdisciplinary research and practice, librarians need to develop knowledge of the research lifecycle, data policy, data curation and subject-specific data management, which poses new challenges for them. Because the process of future data curation is crucial, ICT units in organizations and libraries, with their preservation skills and repository experience, need to work together. Thus, library professional associations globally have developed education and training frameworks for the skills and competencies that librarians require to extend their services to data curation. Librarians can play a vital role in the current data curation system by creating policy and data deposit templates, preserving data, administering infrastructure, and establishing a collaboration network among data generators to understand data curation needs and the importance of data management and sharing.
Data Curation
Data curation is the technical function that ensures research datasets are stored and managed in ways that promote ongoing integrity and accessibility. It is the activity of managing data throughout its lifecycle: appropriately maintaining its integrity and authenticity, and ensuring that it is properly appraised, selected, securely stored and made accessible, while remaining usable in subsequent technology environments. The understanding of data, as well as research results and data acquisition and manipulation processes, must also be curated. Research data curation can be performed by individuals, departments or groups, institutions, communities, disciplines, publishers, national services and third-party services (Rusbridge, 2007). Data librarians, data curators and research communities play a significant role in data curation by appraising the value of data for long-term preservation. The term appraisal refers to the method of identifying the permanent value of digital content for long-term preservation. Therefore, appraisal in data curation has been closely linked to data repository or data archival policies on research data management (Ogier, Nicholls and Spee, 2017).
Research Data Repository
A data repository (also referred to as a research data repository) is a searchable interfacing entity that can store, manage, maintain and curate data/digital objects. It manages the location where research data is registered, permanently stored, made accessible and retrievable, and curated (Johnston, 2017). Research data repositories are an opportunity for librarians to leverage their expertise in curation, outreach and preservation while strengthening their long-standing relationships with academic departments in order to implement robust repositories that satisfy the needs of their communities (Gerwig, 2017). Treloar and Wilkinson (2008) argue that research data repositories should support easy access to data and other information in reliable and consistent forms. Data repositories are gradually replacing the institutional repositories of universities (Gowen, Meier, 2020), because institutional repository platforms (EPrints, DSpace, Digital Commons, OJS) also support data discovery, provenance, access controls, access, identity management, auditing of use, accountability and impact (Alsaad, O’Hara, Carr, 2019; Macgregor, 2020).
The re3data.org Registry of Research Data Repositories, which indexes research data repositories and offers services to researchers, funding organizations, libraries, publishers, etc., was launched on 28th May, 2013. It indexes over 2,300 research data repositories from around the world and presents them in typological categories such as institutional, disciplinary, multidisciplinary and project-specific repositories (Pampel, H., et al., 2013). These repositories are sponsored by governments, funding agencies, academic institutions, professional societies and scholarly publishers (Goben & Sandusky, 2020). The FAIRsharing registry is also a collection of public research data repositories which provides standards, databases and policy collections/recommendations (Suhr, et al., 2020).
The key factors that help research data repositories encourage research data deposit and sharing are: making the repository more visually appealing; carrying out tailored and continuous advocacy; demonstrating statistics and international interest; getting good visibility in Google’s search engine results; strong community support for the repository; careful use of terminology and the language of ‘repository-speak’; making the deposit process as easy and streamlined as possible, saving time with data entry and avoiding duplication of effort; and support and good practice in managing research data, one's own IPR and clearing third-party copyright (Gramstadt, 2012). In the UK, JISC has encouraged the creation of several repositories such as EThOS (http://www.ethos.ac.uk/), JorumOpen (http://www.jorum.ac.uk/) and Depot (http://depot.edina.ac.uk/), which provide usage services, preservation services and shared infrastructure services (Jacobs, Thomas, & McGregor, 2008).
Literature Review
Data curation is not a new term, being well established in art and museum practices (Rusbridge, 2007). However, it is relatively new in relation to research data and was first used in Russian literature (Kosinov, et al., 2019). Research data curation goes beyond data management, as it comprises additional services to preserve and add value within the research lifecycle of the research project and beyond, i.e. by enabling reuse (Partlo, Symons and Carlson, 2015). It adds value and increases the quality of data (Ali, 2019) and, as it is an active and on-going data management activity throughout the research lifecycle, it enables data “authentication, archiving, management, preservation, retrieval and representation” (GEO, 2015). The Digital Curation Centre (DCC) guide on how to develop RDM services describes the roles and responsibilities of the individual stakeholders who can deliver RDM services (Jones, Pryor, Whyte, 2013). The DCC takes a broad view of data curation and is concerned mainly with sustainability and exit strategy, data resources, access, re-use, preserving and archiving, and time scales (Rusbridge, 2007). Curating research data is part of the scholarly record and is recognized by research funders, government agencies and research institutes (Bryant, Lavoie, Malpas, 2017). Digital curation is defined as “maintaining and adding value to a trusted body of digital information for future and current use, specifically, the active management and appraisal of data over the entire lifecycle” (Ogier, Nicholls, Spee, 2017).
A research report by Rusbridge (2007) identified the top data curation activities for research data as documentation, secure storage, metadata, data visualization, versioning, file format transformation, quality assurance, software registry, contextualization, code review, persistent identifiers and file audit. Bielefeld University Library has formulated a minimal data quality framework, the “Data Irreproducibility Analyzer (DIRA)”, for checking data quality (Schirrwagen, et al., 2019). UiT University Library provides data curation services including data collection, description and organization, analysis, archiving, sharing and re-collection (Ali, 2019).
Nowadays, data processing in data curation may range from simple calculations made in a spreadsheet editor to distributed processes that process data using dedicated software and hardware tools (Miksa, Rauber, 2015). Data processing needs comprehensive guidance; a few of the processes are to build a workflow for curation/re-usability, keep data that can be processed, establish ownership and allowable uses, and make the data citable (Rusbridge, 2007). The level of staffing and the skills for curating data are key to research data curation, which makes data easier for fellow researchers and future collaborators to understand and more likely to be trusted (Johnston et al., 2017).
A librarian requires further skills in subject knowledge, IT knowledge, legal knowledge, ethics and research life cycle awareness, including data curation skills and data description and documentation skills (Schmidt, Shearer, 2016). The broad skills that a librarian requires for RDM implementation are providing access to data, advocacy and support for managing data, and managing data collections. Additionally, the librarian's role also relates to open access and institutional repositories, collection development, advisory services (copyright, policies, etc.), information literacy, digital curation, digital preservation and digital collections.
Objectives
The objective of this research study is to present the best data curation activities in research data repositories for enabling effective research data sharing and reuse. The broad objectives of the study are as follows:
• To identify the best practices of the data curation process in research data repositories
• To examine the interdisciplinary role of librarians, and the skills and competencies required for providing data curation services
• To highlight the responsibilities of data generators and data providers
• To give an overview of the software and application tools available for various data curation activities
Methodology
The researchers adopted a qualitative research methodology for the study. It was guided by content analysis of published literature. The Scopus citation and bibliographic database and the Google Scholar search engine were used to find literature on data curation activities in research data repositories, the role of interdisciplinary librarians, and the skills and competencies required to perform research data curation activities. Literature was also analyzed on the primary role and responsibilities of the data generator/provider in the data curation process to ensure that data access is made available through the proper channel. A web survey was undertaken to identify and collect software and application tools available for data curation activities, including data repositories, data cleaning, metadata schemas, data identifier schemas, controlled vocabularies, creating and editing metadata, editing images or videos, storing data, identifying and validating data files, transferring data, indexing data for searches, tracking and measuring data, internet web applications, etc.
Findings and Discussion
Data Curation Process
The effective implementation of the data curation process fundamentally requires professional skills, education, domain knowledge and IT skills on the part of the data librarian, researcher and other stakeholders (Goben, Raszewski, 2015; Wiljes, Cimiano, 2019). Information professionals, data stewards, data librarians, data curators, IT departments, metadata experts and IR managers aspire to skills in data storage, management, archiving and research data sharing, metadata creation and metadata analysis. When a data set is received from a researcher, the following aspects need to be considered:
• General aspects: How many files are there? What is the total size of the data set? What are the file formats, and what software is needed to open the files? What is the stage of the data (e.g., raw, processed, etc.)? Is documentation available? Who owns the copyright for this data? What related metadata standards are commonly used in the data or the field?
• Data sharing concerns: Who are the co-authors of the data? Who funded this research, and are there agency requirements for sharing? What are the institutional obligations for data release? Is there potentially patentable information? What licenses, if any, would the data be released under? Are there sharing concerns, such as protecting the identity of human subjects? What are the goals for dissemination (e.g., worldwide access, researcher-only access)? Are there existing repositories in the field that you find and download data from?
• Long-term value: Are there existing publications that make use of or cite the data? Will the data change or be added to over time? How often? Are there alternative file formats recommended for deposit (e.g., the data curator may recommend a format for preservation purposes)? Is the data easy or difficult to reproduce, and why? What is the reuse potential of this data? When, if ever, should the data be withdrawn or destroyed?
To answer these questions, a potential set of data curation activities is needed. Figure 1 and Table 1 present the major data curation processes, the major activities, the interdisciplinary librarians and their required skill sets, and the responsibilities of the data generator around data curation.
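The first general questions (how many files, total size, which file formats) can be answered mechanically when a deposit arrives. The following is a minimal sketch, assuming the deposit is an ordinary folder on disk; the function name `inventory_dataset` and the shape of the returned summary are illustrative choices, not part of any tool named in this paper.

```python
from collections import Counter
from pathlib import Path


def inventory_dataset(root):
    """Summarize a deposited dataset folder: file count, total size, formats.

    Hypothetical helper for illustration; `root` is the deposit folder.
    """
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    # Count files by lowercase extension; files without one are grouped together.
    formats = Counter(p.suffix.lower() or "(no extension)" for p in files)
    return {
        "file_count": len(files),
        "total_bytes": sum(p.stat().st_size for p in files),
        "formats": dict(formats),
    }
```

A curator could run this on each incoming deposit and record the summary alongside the deposit template. Questions about ownership, licensing and reuse potential still have to be answered by the data generator.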
Figure 1. Data Curation Process
Table 1. Data Curation Process Best Practices, Interdisciplinary Librarian’s Role, Skill Set and Responsibilities of Data Generator and Researcher

Data curation process: Preparation for data curation
Major activities: Creating local policies and rules; Data curation workflow wizard; Building infrastructure and facilities; Skills and knowledge for data curation; Assigning curation responsibilities to the appropriate data curator; Building and planning the data curation structure in the data repository
Interdisciplinary librarian: Head - Data Librarian
Skill set: Understanding the data curation lifecycle; Strategic understanding and influencing skills; Understanding the best practices of data curation; Knowledge of data curation activities; Soft skills; Time management
Possible responsibilities of data generator: Collaboration with data curator, librarian, metadata specialist, data repository manager and other stakeholders; Taking part in creating internal data curation policies, rules and workflow; Volunteering for data curation activities

Data curation process: Understanding needs of data curation
Major activities: Understanding researcher needs; Interviewing data generator; Consulting researcher and data provider; Conceptualizing data with data provider; Providing outreach for data repository and data curation
Skill set: Knowledge of researchers' needs and available datasets; Negotiation and communication skills; Coordination skills across institutions; Ability to communicate and collaborate; Ability to work with the data curation team
Possible responsibilities of data generator: Conceptualizing data with data curators; Selection of data to be curated and published; Identification of data-generating sources

Data curation process: Data receiving and gathering
Major activities: Data receiving policy for researchers; Deposit rights and license agreement; Facilities for data deposit; Data deposit template; Data gathering with minimum available metadata and documentation; Acknowledgement of data receipt
Interdisciplinary librarian: Data Curator
Skill set: Knowledge of data deposit template creation; Negotiation skills; Coordination skills across institutions; Familiarity with research data; Academic research methods; Ability to handle data complexity and diversity; Skills in computer science
Possible responsibilities of data generator: Adhering to the data deposit policy; Depositing data files in the required template with data sources; Providing the minimum metadata of the data; Providing data access rights; Providing a Creative Commons license

Data curation process: Data cleaning and appraisal
Major activities: Understanding of data: stage of data (e.g. raw, processed, etc.), size, types, format, etc.; Data validation and verification; Data cleaning from a statistical perspective; Data classification; Identifying qualitative and quantitative errors; Error repairing; Migration; Considering key risk factors
Interdisciplinary librarian: Data Scientist / Data Analyst
Skill set: Data quality literacy skills; Ability in data cleaning and error detection; Familiarity with research data cleaning tools and applications; Familiarity with data analysis and statistical tools; Ability to understand and measure data quality; Knowledge of geospatial data and software; Knowledge of data manipulation technologies
Possible responsibilities of data generator: Converting file formats to non-proprietary formats; Data classification by stage of data, size, format and type; Minimizing errors; Extending operational support for error repairing; Extending support in data analysis

Data curation process: Data documentation and metadata
Major activities: Data preparation for preservation: formatting, file organization, packaging; Metadata creation and description; Structural and descriptive metadata; Disciplinary metadata standards/schemas; Metadata tools and resources; Identifier schemas, controlled vocabularies; Use of application tools available for data curation
Interdisciplinary librarian: Metadata Specialist / Subject Specialist
Skill set: Familiarity with metadata elements, standards, schemas and tools; Skills in metadata creation; Knowledge of data preparation methods and organization; Knowledge of disciplinary metadata standards, particularly for research data; Knowledge of the subject area; Skills in creating metadata standards according to data sets; Familiarity with identifier schemas and controlled vocabularies
Possible responsibilities of data generator: Familiarity with metadata structure, standards and schemas; Metadata creation for data; Unique and proper file naming system; File arrangement in a hierarchy; Working with metadata specialists to create appropriate metadata

Data curation process: Long-term data preservation
Major activities: Needs of long-term preservation; Selecting datasets for long-term preservation; Rights and permissions; Data storage and security; Long-term preservation value; Planning and standardization
Interdisciplinary librarian: Data Manager
Skill set: Skills in license agreements and copyright; Long-term data management skills; Knowledge of policies for preserving data
Possible responsibilities of data generator: Valid datasets for long-term preservation; Providing rights and permission for archiving; Secured authentication for confidential data preservation

Data curation process: Data publishing in repository
Major activities: Finding trusted data repositories; Building an open data repository; Assisting with required data security features; Transferring the processed data files to the repository; Data sharing policies; Key components of data publishing
Interdisciplinary librarian: Repository Manager
Skill set: Technical skills; Software required for the repository; Developing and managing data repositories; Knowledge of server and digital library architecture; Managing the data repository on a daily basis; Teamwork skills; Maintaining and updating repository software
Possible responsibilities of data generator: Suggesting and finding disciplinary data repositories; Allowing data sharing and publishing; Ensuring deposited data is published in the repository; Working with the data curator and repository manager to upload datasets into the repository

Data curation process: Data access and reuse
Major activities: Searchable and discoverable facilities; Annotating data for relevant entities; Optimizing data for search engines; Keeping data up to date in mirror repository; Data embargo, authenticated access; Monitoring data reuse and data citations; Considering post-ingest review techniques; Providing customer support; User access guide and [...]
Skill set: Data promotion skills; Data attention platforms; Search engine optimization; Indexing the data repository in scholarly search engines; Enabling automatic processes for indexing; Optimizing data repository content
Possible responsibilities of data generator: Making sure published data is accessible to users; Embargoed access; Providing request-based access to confidential data; Monitoring data citations received; Promotion of data access; Searching/browsing/downloading; Sharing recommended citations and contributing citation data

Data curation process: Re-evaluation of data
Major activities: Evaluating or viewing research data; Withdrawal of data from repository; Ensuring future usability; Enabling data citations; Keeping data regularly up to date
Interdisciplinary librarian: Data Curator
Skill set: Evaluating the value of data; Skills to identify outdated data
Possible responsibilities of data generator: Evaluating the data published in the data repository; Ensuring long-term data quality

Data curation process: Analyzing data usage
Major activities: Monitoring usage, downloads, views and citations; Managing descriptive statistics of data usage; Data usage tracking; External data users, researchers, funders, agencies
Interdisciplinary librarian: Data Manager
Skill set: Usage analysis and review skills; Skills in enabling data usage tools; Enabling altmetrics; Improving access experience; Skills in metrics
Possible responsibilities of data generator: Dissemination of data sets; Social networking/sharing/tagging; Data citing

Data curation process: Continuing practices
Major activities: Educating researchers; Providing workshops on data analysis tools; Outreach programs for data curators and the institute community; Adopting best practices of data curation; Learning future technologies
Interdisciplinary librarian: Head - Data Librarian
Skill set: Data literacy skills; Event organization skills; Tutorial and training module development skills; Networking skills; Subject experts
Possible responsibilities of data generator: Taking part in workshops and training programs; Identifying the best practices in data curation
Research Data Librarians: Interdisciplinary
Roles and Competencies
The research data curation process is typically defined as a set of activities that requires the involvement of multi-skilled information professionals, and it covers data curation planning, data acquisition, data cleaning, analysis, data publishing, long-term preservation, etc. Therefore, interdisciplinary, skilled information professionals are required to implement best practices in data curation activities. As mentioned in Table 1, the major information professionals include the data librarian, curation librarian, digital collections curator, digital content strategist, data management consultant, data curation librarian, digital projects designer, repository specialist, technical analyst, repository coordinator, data curation specialist, metadata specialist, system administrator and software developer.
A strong skill set and competencies are required for such interdisciplinary librarians to develop best practices in data curation activities in the research data repository. The major skills include an understanding of the data curation lifecycle, familiarity with research data (e.g., the ability to handle data complexity and diversity), collection management skills, metadata knowledge, particularly for research data, and knowledge of the technical details of repository software, servers and their architecture. An understanding of disciplinary metadata and workflows and knowledge of academic research help them to build or plan the data curation structure in their data repository, consult with data providers and connect them to metadata specialists or repository managers, facilitate communication across different entities, conduct outreach and educate the campus community, work with data providers to add metadata and upload data into the data repository, help data providers to create appropriate metadata for their datasets, support researchers and help in the management of data repositories.
Responsibilities of Data Generator
Data generators, also called data providers, are primary stakeholders in the research data curation practices of institutes. The major data generators of an institute include faculty members, postdoctoral researchers, researchers, graduate students, undergraduate students and other affiliated researchers who have been involved in creating research data from sampled data sources in both qualitative and quantitative studies. Therefore, the data generator has a huge role to play in establishing best practices in research data curation. Data generators have to work collaboratively with the data librarian, data curator, data scientist, metadata specialist, data manager and repository manager to make sure research data is published through research data repositories, enabling data sharing and reuse. Researchers have gradually been managing and organizing the data they create so that it can be easily retrieved when required. But to publish research in a repository, the researcher has to work with the data curation team to properly preserve research data. Being the data owner, the researcher has to help the metadata specialist define the metadata elements of the data they generate. The various types of data created, such as data (e.g., raw data), text documents (e.g., Word, PDF, LaTeX, TXT), spreadsheets (e.g., Excel), slides (e.g., PowerPoint), audio, audio-visual material, images, laboratory notes, statistical data and databases (e.g., Access, MySQL), require different metadata elements to define the data. As mentioned above in Table 1, along with metadata creation, the data generator has a key role to play with the Head - Data Librarian in creating internal data curation policies, rules and workflows, conducting outreach activities and identifying best practices in data curation. The data generator further has to deposit the research data in a template made by the data curator, transfer data access rights, convert file formats to non-proprietary formats, and support data classification, error minimization, data analysis, browsing/downloading, etc. The data generator also helps in promoting the research data published through a research data repository by sharing it through social networking sites, scholarly search engines and data repository directories, to increase data citations and widely disseminate the research data across the world. The data generator can also evaluate the data published in the data repository to ensure long-term data quality and that up-to-date data is available for user access. The data generator holds the right to share confidential data with users; therefore, users have to make a formal request to the data generator through the data repository to obtain confidential data. Table 1 gives an overview of the key roles of the data generator in each stage of data curation.
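As an illustration of the metadata elements a data generator and a metadata specialist might agree on, the sketch below builds a simple Dublin Core description with Python's standard library. The element names (title, creator, format, rights) and the namespace URI come from the Dublin Core element set; the wrapping `metadata` element, the function name and the sample values are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a Dublin Core XML record from a dict of element name -> value(s).

    Hypothetical helper: the root element name 'metadata' is an arbitrary choice.
    """
    root = ET.Element("metadata")
    for name, values in fields.items():
        # Accept a single string or a list of strings for repeatable elements.
        for value in ([values] if isinstance(values, str) else values):
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Example with made-up values for a spreadsheet dataset.
print(dublin_core_record({
    "title": "Sample survey responses",
    "creator": ["Researcher A", "Researcher B"],
    "format": "text/csv",
    "rights": "CC BY 4.0",
}))
```

A spreadsheet, an image collection and a database would each fill different elements or need a disciplinary schema instead, which is why the element set is decided per data type.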
Software and Application Tools for Data
Curation
The data curation process is a set of technical and non-technical activities, and it requires various software application tools to be implemented in the research data repository. Many open-source and proprietary software application tools are available for each stage of the data curation process. Table 2 presents the major application software available for data curation in a research data repository. A data librarian can use a best practice template to prepare a draft of the data curation process, and online survey/interview tools can be used to understand the needs of data curation. A data deposit template helps in gathering research data from the data generators with the required additional details. Members of a data curation team should have the skills to use the software application tools below for data curation activities.
Table 2. Software and Application Tools for Data Curation in Research Data Repository

Type: Software and Application Tools
Data Repository Software: Bepress Digital Commons, DSpace, Hydra, Dataverse, HUBzero, Aubrey, SobekCM, EPrints, CKAN
Metadata Schemas: Dublin Core (DC), Qualified DC, DataCite Metadata, MODS, METS, PREMIS, MIX, EAD, Darwin Core, Ecological Metadata Language (EML), Visual Resources Association (VRA Core), DDI, CIF (Crystallographic Information Framework), ABCD (Access to Biological Collection Data), AgMES (Agricultural Metadata Element Set), AVM (Astronomy Visualization Metadata)
Metadata Schemas used in Supplementary Space: Darwin Core, EML, DDI, TEI, FGDC, ISO 19115 Geographical Metadata (ISO 19115)
Identifier Schemas: DataCite, DOI, Handle, ARK, HTTP URI, Permanent local URL
Controlled Vocabularies: DC Controlled Vocabularies, Library of Congress Subject Headings (LCSH), Medical Subject Headings (MeSH), Faceted Application of Subject Terminology (FAST), Only with Hydra: DC RDF Ontology, FOAF, RDF Schema, Astronomy Thesaurus, NASA Thesaurus, Art & Architecture Thesaurus
Creating and Editing Metadata: Microsoft Word, Microsoft Excel, Text Editor (WordPad, Notepad++), Oxygen XML Editor, Morpho (Ecology Metadata Editor), Nesstar
Editing Images or Videos: SnagIt, Photoshop for images, HandBrake for audiovisual, image
Cleaning Data: OpenRefine, DataCleaner
Storing Data: Dropbox, Google Drive
Identifying and Validating Data Files: DROID, PRONOM, Git for version control, FITS for file characterization
Transferring Data: BagIt
Indexing Data for Searches: Apache Solr
Tracking and Measuring Data: Altmetric
Internet web applications: EZID service, Google Refine
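Several of the tools in Table 2 (for example BagIt for transfer) rely on checksums to confirm that files arrive and remain unchanged. The sketch below shows that underlying idea with Python's standard library only. It is a simplified illustration, not an implementation of the BagIt specification, and the function names are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """List '<checksum>  <relative path>' lines for every file under root."""
    root = Path(root)
    paths = sorted(p for p in root.rglob("*") if p.is_file())
    return [f"{sha256_of(p)}  {p.relative_to(root).as_posix()}" for p in paths]


def verify_manifest(root, lines):
    """Return the relative paths that are missing or whose checksum changed."""
    root = Path(root)
    changed = []
    for line in lines:
        expected, rel = line.split("  ", 1)
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(rel)
    return changed
```

A manifest built at deposit time can be checked again after transfer or at regular intervals during preservation; an empty result from `verify_manifest` means every listed file is still present and unchanged.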
Conclusion
Data curation, the activity of managing and promoting the use
of research data from its point of creation, keeps the data
usable for contemporary purposes and supports their discovery
and reuse. Best-practice data curation is a cycle. It begins
with preparation (understanding the needs and arranging the
prerequisite infrastructure) and proceeds through data
receiving, cleaning, documentation, metadata creation,
preservation, data publishing and data access to the
evaluation of usage and data citations. Disciplinary metadata
creation is a major activity of data librarians, along with
data repository development, and it helps in publishing data
for future reuse. Librarians have to develop interdisciplinary
and IT-oriented skills and competencies, along with academic
research knowledge, to perform the best data curation practices.
Interdisciplinary skills help them with data curation,
metadata creation, description and documentation.
IT-oriented competencies serve to develop user-friendly
data repositories and to clean data. Academic research
knowledge helps librarians to understand research
activities and the data curation assistance that data
generators need. Appropriate use of software and application
tools facilitates the development and practice of data
curation. Training and continuing education in data
curation skills are not limited to library and information
professionals; their scope has expanded to professionals
from the computer science domain too. This study benefits
the key data curation stakeholders in understanding the best
practices for data curation by identifying the key stages of
data curation, the core data curation activities, the software
and application tools available, and the roles and
responsibilities of each stakeholder, including data
librarians and researchers.
References
1. Ali AK. Research Data Management Support. The 14th
Munin Conference on Scholarly Publishing. Tromsø:
UiT University. 2019.
2.
Bryant R, Lavoie B, Malpas C. The Realies of Research
Data Management Part Four: Sourcing and Scaling
University RDM Services. Dublin: OCLC Research. 2018.
3.
Cudre-Mauroux P. Design Consideraons on SWITCH’s
Connectome Vision. Zorich: A SWITCH Innovaon Lab.
4.
GEO. Data Management Principles Implementaon
Guidelines. Group of Earth Observaons.
5. Gerwig KJ. Current Outreach and Markeng Pracces
for Research Data Repositories. L. Johnston, Curang.
Research Data 2017; 1.
6.
Goben A, Sandusky RJ. Open data repositories: current
risks and opportunies. College & Research Libraries
News 2020; 81(2).
7.
Gramstadt MT. Kulvang Kultur: Increasing Arts
Research Deposit. ARIADNE. 2012.
8.
Jacobs N, Thomas A, McGregor A. Institutional
repositories in the UK: the JISC approach. Library Trends
2008; 57(2): 124-141. doi:10.1353/lib.0.0035
9. Jones S, Pryor G, Whyte A. How to Develop Research
Data Management Services - a guide for HEIs. In DCC
How-to Guides. Edinburgh: Digital Curaon Centre.
10. Library. Research Data Curaon. Retrieved from The
University of Melbourne: hps://library.unimelb.edu.
au/digital-stewardship/research_data_curaon
11. Miksa T, Rauber A. Beyond Data: Process Sharing and
Reuse. ERCIM News 2015; 100: 18-10.
12. Noonan D. Data Curaon and the University Archives.
The American Archivist 2014; 77(1).
13.
Ogier A, Nicholls N, Spee R. Open Exit: Reaching the End
of the Data Life Cycle. L. Johnston, Curang. Research
Data 2017; 1.
14.
Partlo K, Symons D, Carlson JD. Revoluonary or
evoluonary? Making research data management
manageable. B. Eden, Creang Research Infrastructures
in the 21
st
Century Academic Library: Conceiving,
Funding and Building New Facilies and Sta.
15.
Rusbridge C. Create, curate, re-use. Proceedings of
Educause Australasia, 2007. Auckland: EDUCAUS.
16. Schmidt B, Dierkes J. New alliances for research and
teaching support: establishing the Göngen eResearch
Alliance. Program 2015; 49(4): 461-474. doi:10.1108/
PROG-02-2015-0020
17.
Schmidt B, Shearer K. Librarians’ Competencies Prole
for Research Data Management. Chicago: Joint Task
Force on Librarians’ Competencies in Support of
EResearch and Scholarly Communicaon, 2016.
18.
University of California, L. Research Data Curaon.
Retrieved from University of California, LIBRARY: hp://
library.ucmerced.edu/research-data-curaon
19. Wi M. Instuonal Repositories and Research Data
Curaon in a Distributed Environment. Library Trends
2008; 57(2): 191-201.