Article

The National Geospatial Digital Archives— Collection Development: Lessons Learned

Abstract

Building a geospatial digital archive resembles building a hard-copy map collection in many ways; two major similarities are the need for a collection development policy and the amount of hard work required to seek out and acquire the resources. Two institutions, the University of California at Santa Barbara and Stanford University, the initial partners in the National Geospatial Digital Archives (NGDA), chose to collect digital data in line with each library's standard collection strengths and responsibilities. Collection development policies were written for the project as a whole and for each partner institution. While based on traditional paper map policies, these geospatial collection development policies are tailored specifically for digital data by including sections on metadata, versioning, file formats, proprietary formats, data set size, and ownership/access considerations. During the acquisition phase of the contract, a considerable amount has been learned about file formats, acquisition of compressed vs. uncompressed files, short-term storage prior to repository ingest, and metadata creation. While metadata creation at the collection or series level has been relatively easy, the acquisition phase has underscored the challenges inherent in creating accurate item-level metadata. One of the central findings of the NGDA experience is that format information is vital for long-term preservation. Thus, the need to understand file formats and specifications has led to the creation of a format registry specifically for geospatial materials.

... This becomes evident when searching geosciences journals, especially in environmental science, where researchers regularly use decades-old data to explore processes involving change (Conway et al. 2013; Kennedy et al. 2008; Cushing et al. 2008; Leyk, Boesch, and Weibel 2006). Thus not only climate change research, but also any other kind of environmental change analysis can benefit from long-term data preservation (Beruti et al. 2010; Erwin, Sweetkind-Singer, and Larsgaard 2009; Harris 2001; Janée 2008; National Academy of Sciences. National Research Council and Committee on the Preservation of Geoscience Data and Collections 2002; Shaon and Woolf 2011; European Commission. ...
... The lack of consciousness about the value of geographic assets in records management is considered to be most likely to change early. Indeed, the value of such data has been emphasized many times (Lazorchak et al. 2008; Erwin, Sweetkind-Singer, and Larsgaard 2009; North Carolina Center for Geographic Information & Analysis 2009; Shaon et al. 2012; Harris 2001; Caruso et al. 2013). Awareness of the value of these assets seems to find its way into management practice. ...
Article
There is general agreement that spatial data adds particular difficulties to digital preservation due to, for example, the complexity of data models and semantics specific to individual thematic areas. However, there is a lack of literature providing an overview of the challenges and analyzing in particular the effort required to surmount these in combination with the potential added value gained through digital preservation. The Delphi method was used to evaluate obstacles to archiving geographic vector and raster data serving as a basis for topographic base map creation, seen through the lens of data producers, providers and guardians. Two international Delphi groups were questioned on developments regarding geodata, and their influences on access and preservation. The obstacles to preservation that they mentioned were financial, managerial, legal, and technological in nature. The technological obstacles are more likely to be surmounted within at least 10 years than the non-technological ones. The study shows that the lack of standardization and the use of proprietary formats remain central problems. Furthermore, the consciousness about the value of geographic assets is considered most likely to rise early. As a good starting point for improving archiving of spatial data, we also suggest the controlled disposal of superfluous data as a measure to reduce cost.
... the difference between storing data and preserving it; and preserving only the most important data (Erwin et al., 2009). ...
Article
Purpose: The aim of this paper was to explore digital preservation requirements within the wider National Geoscience Data Centre (NGDC) organisational framework in preparation for developing a preservation policy and integrating associated preservation workflows throughout the existing research data management processes. This case study is based on an MSc dissertation research undertaken at Northumbria University. Design/methodology/approach: This mixed methods case study used quantitative and qualitative data to explore the preservation requirements and triangulation to strengthen the design validity. Corporate and the wider scientific priorities were identified through literature and a stakeholder survey. Organisational preparedness was investigated through staff interviews. Findings: Stakeholders expect data to be reliable, reusable and available in preferred formats. To ensure digital continuity, the creation of high-quality metadata is critical, and data depositors need data management training to achieve this. Recommendations include completing a risk assessment, creating a digital asset register and a technology watch to mitigate against risks. Research limitations/implications: The main constraint in this study is the lack of generalisability of results. As the NGDC is a unique organisation, it may not be possible to generalise the organisational findings, although those relating to research data management may be transferrable. Originality/value: This research examines the specific nature of geoscience data retention requirements and looks at existing NGDC procedures in terms of enhancing digital continuity, providing new knowledge on the preservation requirements for a number of national datasets.
Book
This selective bibliography presents over 500 English-language articles, books, and technical reports. It covers digital curation and preservation copyright issues, digital formats (e.g., data, media, and e-journals), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns. Most sources have been published from 2000 through February 2011. It is under a Creative Commons Attribution License. It is also available as a website with a Google Translate link (https://tinyurl.com/24avtyuu). "This tremendous resource is. . . an excellent place to survey much of the available research on a topic related to data curation". - Julia Flanders and Trevor Muñoz. "An Introduction to Humanities Data Curation." In DH Curation Guide: A Community Resource Guide to Data Curation in the Digital Humanities, 2012.
Article
Knowledge, as it has been shaped in the United States, is grounded in whiteness. As a result, maps and geospatial data can be particularly harmful in perpetuating historically and experientially inaccurate narratives of space. As stewards of knowledge, librarians are uniquely positioned to implement policies advancing antiracist practices. The following paper analyzes diversity, equity, and inclusion (DEI) in cartographic collection development, metadata, and instruction, and discusses the opportunities for librarians to employ critical theory in their cartographic and geospatial library praxis.
Article
Preface: The Research Data Curation and Management Bibliography includes over 800 selected English-language articles and books that are useful in understanding the curation of digital research data in academic and other research institutions. The "digital curation" concept is still evolving. In "Digital Curation and Trusted Repositories: Steps toward Success," Christopher A. Lee and Helen R. Tibbo define digital curation as follows: Digital curation involves selection and appraisal by creators and archivists; evolving provision of intellectual access; redundant storage; data transformations; and, for some materials, a commitment to long-term preservation. Digital curation is stewardship that provides for the reproducibility and re-use of authentic digital data and other digital assets. Development of trustworthy and durable digital repositories; principles of sound metadata creation and capture; use of open standards for file formats and data encoding; and the promotion of information management literacy are all essential to the longevity of digital resources and the success of curation efforts.1 The Research Data Curation and Management Bibliography covers topics such as research data creation, acquisition, metadata, provenance, repositories, management, policies, support services, funding agency requirements, open access, peer review, publication, citation, sharing, reuse, and preservation. It is highly selective in its coverage. The bibliography does not cover conference proceedings, digital media works (such as MP3 files), editorials, e-mail messages, interviews, letters to the editor, presentation slides or transcripts, technical reports, unpublished e-prints, or weblog postings. Most sources have been published from January 2009 through December 2019; however, a limited number of earlier key sources are also included. The bibliography has links to included works. URLs may alter without warning (or automatic forwarding) or they may disappear altogether.
Where possible, this bibliography uses Digital Object Identifier System (DOI) URLs. DOIs are not rechecked after initial validation. Publisher systems may have temporary DOI resolution problems. Should a link be dead, try entering it in the Internet Archive Wayback Machine. Abstracts are included in this bibliography if a work is under a Creative Commons Attribution License (BY and national/international variations), a Creative Commons public domain dedication (CC0), or a Creative Commons Public Domain Mark and this is clearly indicated in the publisher’s current webpage for the article. Note that a publisher may have changed the licenses for all articles on a journal’s website but not have made corresponding license changes in the journal’s PDF files. The license on the current webpage is deemed to be the correct one. Since publishers can change licenses in the future, the license indicated for a work in this bibliography may not be the one you find upon retrieval of the work. Unless otherwise noted, article abstracts in this bibliography are under a Creative Commons Attribution 4.0 International License, https://creativecommons.org/licenses/by/4.0/. Abstracts are reproduced as written in the source material. 1 Christopher A. Lee and Helen R. Tibbo, "Digital Curation and Trusted Repositories: Steps Toward Success," Journal of Digital Information 8, no. 2 (2007). https://journals.tdl.org/jodi/index.php/jodi/article/view/229
Book
This selective bibliography presents over 800 English-language articles and books. It covers topics such as research data creation, metadata, provenance, repositories, management, policies, support services, funding agency requirements, open access, peer review, publication, citation, sharing, reuse, and preservation. It is also available as a paperback PDF file (https://www.digital-scholarship.org/rdcmb/rdcmb.pdf) and a website (https://www.digital-scholarship.org/rdcmb/rdcmb-web.htm), which includes a Google Translate link. Most sources were published from 2009 through 2019. It includes full abstracts for works under certain Creative Commons Licenses. This work is licensed under a Creative Commons Attribution 4.0 International License. (See also the Research Data Sharing and Reuse Bibliography and the Research Data Publication and Citation Bibliography.) Keywords: academic libraries, altmetrics, data citation, data curation, data journals, data preservation, data privacy, data publication, data repositories, data reuse, data sharing, data sharing policies, Digital Object Identifiers, peer review, ethical data sharing, geospatial data, funding agency requirements, open access, open access journals, open science, persistent identifiers, research data, research data management, research data metadata, research data publishing, research data services, research data training, research libraries, scholarly journals, scholarly metrics, and scholarly publishing.
Book
This bibliography presents over 650 English-language articles, books, and technical reports. It covers digital curation and preservation copyright issues, digital formats (e.g., data, media, and e-journals), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns. Most sources were published from 2000 through 2011. It is available as an EPUB file, a low-cost paperback, a paperback PDF file, a website with a Google Translate link, and a website PDF with live links (http://digital-scholarship.org/dcbw/dcb.htm). It is under a Creative Commons Attribution License. "Librarians and scholars who are concerned with managing digital resources and preserving them for future use will find a crash course on the subject in this bibliography. . . . This book is recommended for librarians working with original digital resources, scholars interested in digital repositories, and students in the field." - Paul M. Blobaum, Journal of the Medical Library Association 101, no. 2 (2013): 158.
Chapter
We explore the characteristics of different user groups for legacy geodata from the perspective of a long term archive. For the sake of this study, legacy geodata has been defined as all digital information useful for map creation including aerial photography, digital elevation models, LIDAR data, vector data bases etc., of which there exists at least one more recent version with the same characteristics. In the context of the ISO standard for open archival information systems (OAIS) potential user groups are called designated communities. The archive is supposed to adapt its service to their profiles and needs, which in the electronic environment includes taking into account their level of knowledge of technical aspects. A future technique, more precisely a Delphi study, has been used to predict the potential user groups and their need for geodata versions. In two rounds, two international Delphi groups have been questioned about user professions, frequency of access, amount of data needed, knowledge of GIS, age of the data they are interested in, preferred data set, scales, snapshot intervals and file formats. The answers allowed us to identify the following user types: geophysicists, commercial users, lawyers, policy makers, emergency response planning teams, architects and geo-related engineers, social scientists, the general public, archaeologists, historians, culture and arts professionals, conservation agents of the built and the natural environment, geodata creators and undergraduate teachers and students. We classified the user types by their characteristics into six clusters. The application of the user profiles showed that the method did not deliver sufficiently detailed answers for complying with all OAIS requirements, but that it was effective for gathering user characteristics which guide archives in strategic decisions about the designated communities they might serve.
Article
A collection policy (or a collection development policy) is a document that archival institutions must have to allow them to build their collections legitimately and consistently. Digital archives often lack sustainable financial and systematic supports. Thus, it is especially important for digital archives to have a policy for consistent collection activities. Digital archives have a different set of characteristics from physical records, and these characteristics should be considered in a collection policy. This study was initiated to create such a policy for the No Gun Ri Digital Archive. It reviewed existing literature for collection policies in archives and digital archives. Moreover, it examined several cases of digital archives and their policies to identify the necessary elements as well as the legal and procedural coverages for digital content and services. Furthermore, it studied the unique characteristics of the No Gun Ri incident and the No Gun Ri Digital Archive. Based on such investigation, a collection policy for the No Gun Ri Digital Archive was suggested. It is believed that suggesting a practical collection policy will provide a useful precedent for future digital archive projects.
Chapter
Geographic information (GI) creators, users, and stakeholders exist across nearly all communities, domains, and sectors. Geographic education varies in deployment and delivery for all. Formal training related to GI creation may not emphasize the organization, access, and use aspects of digital curation. Conversely, the existing programs that teach organization, access, and use focus on other information and seldom include coverage of GI. The purpose of this chapter is to outline the history and the current academic landscape, and to pave a path forward for educating the different GI-related occupations. We present a multidisciplinary approach that led to the development of one curriculum. The chapter concludes with a call to develop a twenty-first-century GI workforce by coordinating across existing curricular scaffolds from K-12 to graduate programs.
Article
Geospatial data service units in higher education are facing challenges from collection budgets, staff shortages, rapidly evolving data manipulation technologies, and increasing research and learning interests. Many units have adopted a user-centered approach to address these issues. The core of this approach is to understand what their users need. This study aims to answer this question by analyzing 455 geospatial data requests that were received and fulfilled at McGill University Library during the past two academic years. Results include which departments primarily need geospatial data, which data sets are requested most frequently, which geographic areas receive the most GIS research interest, and a distribution of request numbers over the study period. Recommendations are made about data discovery instruction and consultations, data organization and access, and data collection management based on this study’s results. Findings and recommendations may also be of use for other geospatial data units in a similar context to enhance their services.
Article
The management and curation of digital geospatial data has become a central concern for many academic libraries. Geospatial data is a complex type of data critical to many different disciplines, and its use has become more expansive in the past decade. The University of Idaho Library maintains a geospatial data repository called the Interactive Numeric and Spatial Information Data Engine for Idaho (INSIDE Idaho) to support the land-grant research and outreach mission of the university. INSIDE Idaho-enabled research projects show that curating geospatial data requires a flexible, continuing partnership with researchers. First, changing access formats and data publishing protocols lead to cases in which the data are frequently disseminated in new ways. Second, researchers’ changing expectations about data reuse, as well as their own practices, require a repository to be both responsive and communicative. A strong curatorial process involves enabling new research and applications to be developed from the repository's data collections. Third, the experience with INSIDE Idaho suggests building a culture of data management awareness at an institution can be as valuable a use of a library's time as working to curate the data; that is, developing the local culture improves the efficiency of the curation process, most notably in the case of metadata. The experiences described in this article support the idea that academic libraries must develop infrastructure, policies, skills, and relationships to manage and curate research data successfully.
Book
The Digital Curation and Preservation Bibliography 2010 presents over 500 English-language articles, books, and technical reports that are useful in understanding digital curation and preservation. It covers digital curation and preservation copyright issues, digital formats (e.g., data, media, and e-journals), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns. Most sources have been published from 2000 through 2010. "Librarians and scholars who are concerned with managing digital resources and preserving them for future use will find a crash course on the subject in this bibliography. . . . This book is recommended for librarians working with original digital resources, scholars interested in digital repositories, and students in the field." - Journal of the Medical Library Association. Citation: Charles W. Bailey, Jr., Digital Curation and Preservation Bibliography 2010 (Houston: Digital Scholarship, 2011), http://www.digital-scholarship.org/dcpb/dcpb2010.htm. Bailey, Charles W., Jr. Digital Curation and Preservation Bibliography 2010. Houston: Digital Scholarship, 2011. http://www.digital-scholarship.org/dcpb/dcpb2010.htm.
Article
The role of academic libraries in providing access to and preserving geographic information has seen considerable change. Some libraries are beginning to provide access to geospatial data using geospatial data catalogs and geospatial Web services. Academic libraries are interested in developing geospatial data catalogs, but this can be complicated. The technology is rapidly evolving and many have questions around what software to use and what staffing might be required. The American Library Association (ALA) Map and Geospatial Information Round Table (MAGIRT) Geographic Technologies Committee established the Spatial Data Subcommittee in September 2010 to investigate spatial data catalogs and provide recommendations to MAGIRT on technology, staffing needs, and similar issues needed to develop a spatial data catalog. The subcommittee conducted interviews with eleven geospatial data catalog managers at academic libraries that yielded findings showing the diversity of implementation and prioritization. Guidance is provided that will help academic libraries determine what the best approach is for their situation and needs on their campus.
Article
The National Geospatial Digital Archive is a collaborative project between the University of California at Santa Barbara and Stanford University. The project was funded by the Library of Congress through their National Digital Information Infrastructure and Preservation Program (NDIIPP). The goal of the collaboration was to collect, preserve, and provide long-term access to at-risk geospatial data. The project partners created preservation environments at both universities, created and populated a format registry, collected more than ten terabytes of geospatial data and imagery, wrote collection development policies governing acquisitions, and created legal documents designed to manage the content and the relationship between the two nodes.
Article
Geospatial information access continues to be central to the mission of geography and map libraries. Providing or facilitating access has been, and continues to be, a dynamic process in light of both technological change and policy challenges. While technological changes in providing access have gathered much attention in the literature, substantive discussions regarding policies and practices preventing or assisting information access have been lacking. Even more troubling is the fact that archiving digital geospatial information receives even less attention. This first paper reviews developments and trends with regard to digital geospatial libraries, as this concept has become the primary metaphor by which access is measured. The second paper will focus on international trends related to the effect of policy and practice in terms of promoting the sharing and use of geospatial information needed to bridge gaps in access. These comparative policy and practice perspectives are also needed in order to point to the true promise held by new technologies for sharing, exemplified by digital libraries designed for geospatial information.
Article
Users of geographic data may not be able to afford to purchase and implement a dataset that does not finally meet their needs. Therefore, metadata has a very important role in the information supply environment of geographic data. The development of national/local spatial data infrastructures recognizes the importance of metadata, as do the digital libraries providing spatial data. The new ISO 19115:2003 standard of metadata for geographic information is introduced briefly. In particular, geographic information can be made available as digital maps (or images) that are meant for visual use, or as datasets meant for computational use. Metadata for digital maps is closely related to the metadata elements for conventional maps and can be enhanced by providing a sample map with the data. The case of computational use of geographic data is more complex. There are several details that may appear crucial when determining the fitness of data for an intended use. Understanding the importance of the crucial factors in each use case requires professional skills from the users of metadata.