Technical Report

Abstract and Figures

As of 2017, the task of mobilizing data from biocollections remains enormous: data from 90% of biocollections still need to be mobilized. It is imperative for stakeholders, individual keepers of natural science collections, the community at large, and even funding agencies not only to tackle this backlog as quickly as possible, but also to do so in the best possible order. To establish that order, a demand-driven framework is required, based among other things on the criteria used to digitize biocollections.
Article
In 2010, Naturalis Biodiversity Center started one of the largest and most diverse programs for natural history collection digitization to date. From a total collection of 37 million specimens and related objects, 7 million relevant objects are to be digitized in a 5-year period. This article provides an overview of the program and discusses the chosen industrial production line approach, the applied method for prioritization of collections that are to be digitized, and some preliminary results.
Article
Access to digitised specimen data is a vital means to distribute information and in turn create knowledge. Pooling the accessibility of specimen and observation data under common standards and harnessing the power of distributed datasets places more and more information at the disposal of a globally dispersed workforce, which would otherwise carry on its work in relative isolation, and with limited profile and impact. Citing a number of higher profile national and international projects, it is argued that a globally coordinated approach to the digitisation of a critical mass of scientific specimens and specimen-related data is highly desirable and required, to maximize the value of these collections to civil society and to support the advancement of our scientific knowledge globally.
Article
Digitizing the information carried by specimens in natural history collections is a key endeavor providing falsifiable information about past and present biodiversity on a global scale, for application in a variety of research fields far beyond the current application in biosystematics. Existing digitization efforts are driven by individual institutional necessities and are not coordinated on a global scale. This has led to an overall information resource that is patchy in taxonomic and geographic coverage as well as in quality. Digitizing all specimens is not an achievable aim at present, so priorities need to be set. Most biodiversity studies are both taxonomically and geographically restricted, but access to non-digitized collection information is almost exclusively by taxon name. Creating a “Geotaxonomic Index” providing metadata on the number of specimens from a specific geographic region belonging to a specific higher taxonomic category may provide a means to attract the attention of researchers and governments towards relevant non-digitized holdings of the collections and set priorities for their digitization according to the needs of information users outside the taxonomic community.
Article
Natural history collections represent a vast repository of biodiversity data of international significance. There is an imperative to capture the data through digitization projects in order to expose the data to new and established users of biodiversity data. On the basis of a review of the current state of digitization of natural history collections, a demand-driven approach is advocated through the use of metadata to promote and increase access to natural history collection data.
Article
Traditional approaches for digitizing natural history collections, which include both imaging and metadata capture, are both labour- and time-intensive. Mass-digitization can only be completed if the resource-intensive steps, such as specimen selection and databasing of associated information, are minimized. Digitization of larger collections should employ an "industrial" approach, using the principles of automation and crowd sourcing, with minimal initial metadata collection including a mandatory persistent identifier. A new workflow for the mass-digitization of natural history museum collections based on these principles, and using the SatScan® tray scanning system, is described.
Article
This contribution explores the problem of recognizing and measuring the universe of specimen-level data existing in Natural History Collections around the world, in the absence of a complete, world-wide census or register. Estimates of size seem necessary to plan for resource allocation for digitization or data capture, and may help represent how many vouchered primary biodiversity data (in terms of collections, specimens or curatorial units) might remain to be mobilized. Three general approaches are proposed for further development, and initial estimates are given. Probabilistic models involve crossing data from a set of biodiversity datasets, finding commonalities and estimating the likelihood of totally obscure data from the fraction of known data missing from specific datasets in the set. Distribution models aim to find the underlying distribution of collections’ compositions, figuring out the occult sector of the distributions. Finally, case studies seek to compare digitized data from collections known to the world to the amount of data known to exist in the collection but not generally available or not digitized. Preliminary estimates range from 1.2 to 2.1 gigaunits, of which a mere 3% at most is currently web-accessible through GBIF’s mobilization efforts. However, further data and analyses, along with other approaches relying more heavily on surveys, might change the picture and possibly help narrow the estimate. In particular, unknown collections that have not emerged through the literature are the major source of uncertainty.
Article
A survey on the challenges and concerns involved with digitizing natural history specimens was circulated to curators, collections managers, and administrators in the natural history community in the Spring of 2009, with over 200 responses received. The overwhelming barrier to digitizing collections was a lack of funding or issues directly related to funding, leaving institutions mostly responsible for providing the necessary support. The uneven digitization landscape leads to a patchy accumulation of records at varying qualities, and based on different priorities, ultimately influencing the data's fitness for use. The survey results also indicated that although the kind of specimens found in collections and their storage can be quite variable, there are many similar challenges across disciplines when digitizing including imaging, automated text scanning and parsing, geo-referencing, etc. Thus, better communication between domains could foster knowledge on digitization leading to efficiencies that could be disseminated through documentation of best practices and training.
Holmes, M.W., Hammond, T.T., Wogan, G.O.U., Walsh, R.E., LaBarbera, K., Wommack, E.A., Martins, F.M., Crawford, J.C., Mack, K.L., Bloch, L.M. and Nachman, M.W. 2016. Natural history collections as windows on evolutionary processes. Molecular Ecology, 25: 864-881.
Frazier, C.K., Wall, J. and Grant, S. 2008. Initiating a Natural History Collection Digitisation Project, Copenhagen: GBIF Secretariat. Available online at: https://www.gbif.org/document/80574/initiating-a-collection-digitisation-project.