Content uploaded by Raymond John Uzwyshyn
Author content
All content in this area was uploaded by Raymond John Uzwyshyn on Feb 21, 2020
Content may be subject to copyright.
What is Library Digital
Preservation Storage?
•Simply put, very long-term storage.
•The University Libraries, The Wittliff
Collections and University Archives
increasingly collect and gather digital
information, media and data.
•This data requires longer term storage
in line with Research library national
standards (ISO standards: 16363, 16919,
14721) and longer-term new millennia
archival perspectives
Digital Preservation in Research Libraries
follows a Unique Library Model
3-Legged Stool Model
•Organization
Leverages existing human resources in libraries to
build on their archival/stewardship expertise for the
digital age
•Technology
Synthesizes Technological Capabilities to meld with
Traditional Library Archival/Collection Preservation
Models
•Resources
Utilize Both Library Human Resources and Library
Network resources.
Anne Kenney/Nancy McGovern, 2007
Unique Characteristics of
Long-Term Digital Preservation
•Migration and Preservation of
Formats for Long Term Storage
(Normalization)
•Risk Mitigation for Data and
Content. Multiple bit-level copies,
stored in disparate locations
geographically, administratively, and
technologically.
•Leverages the libraries’ role in
academic environments as
keeper of the scholarly record in
a digital sphere
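The risk-mitigation rule above (multiple copies separated geographically, administratively and technologically) can be expressed as a simple check. This is a minimal sketch, not a method from the slides: the `Copy` record, the `is_well_distributed` function, the two-copy minimum and the strict rule that no two copies may share any attribute are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a preservation file (hypothetical record)."""
    region: str        # geographic location of the copy
    organization: str  # body that administers the copy
    technology: str    # storage technology or platform


def is_well_distributed(copies, minimum=2):
    """Return True if there are at least `minimum` copies and no two
    copies share a region, an organization, or a technology.

    The strict "all distinct" rule and the default minimum are
    assumptions for this sketch, not requirements quoted from a standard.
    """
    if len(copies) < minimum:
        return False
    for attribute in ("region", "organization", "technology"):
        values = [getattr(copy, attribute) for copy in copies]
        if len(set(values)) != len(values):
            return False
    return True


# Hypothetical example: two copies in the same data center fail the check.
same_site = [Copy("site-a", "org-a", "disk"), Copy("site-a", "org-a", "disk")]
spread = [Copy("site-a", "org-a", "disk"), Copy("site-b", "org-b", "tape")]
print(is_well_distributed(same_site), is_well_distributed(spread))
```

A check of this kind makes it explicit why several copies held in one data center do not satisfy the distribution requirement.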
Texas State University Libraries
Digital Preservation Working Group
Background & History
•Formed in 2015 and consists of members of Libraries Digital
and Web Services (Digitalization Lab, Institutional
Repositories), University Archives, Wittliff Collections and Library
General Collections
•Group began by investigating and then authoring the
Libraries’ first Digital Preservation Policy Document (August
2016), benchmark minimums for preservation Masters etc.
•Created Dedicated Local Server Space for Preservation Files
and Use Files with TR
•Opened and Developed an ongoing relationship with
Windows Team (Todd)
2016-2018 New Digital Preservation Tools,
Platforms and Resources Became Available
•Archivematica: Middleware standard for Digital
Preservation Metadata and Integrity
•Archivematica bundles micro-services for normalizing files, managing
metadata and verifying file types, bit-level integrity (checksums) etc.
•Texas State began R&D with Archivematica on Ubuntu
Linux and deployed its first production-level instance
on a new Red Hat Linux platform
•University Archives and Wittliff Collections began
experimenting with, learning and using the software
•All areas gained expertise in the metadata/middleware
workflow process (Archivematica) to create AIPs
(Archival Information Packages) to safely store,
archive and retrieve files and metadata for later use
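The bit-level integrity (checksum) idea mentioned above can be illustrated with a generic fixity check. This is a minimal sketch using Python's standard `hashlib` module; it is not Archivematica's own implementation, and the function names `sha256_of` and `verify_fixity` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks
    so that large preservation masters do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded
    when it was first stored (for example, at ingest)."""
    return sha256_of(path) == recorded_checksum.lower()
```

In use, the checksum is recorded once when a file is stored and `verify_fixity` is run periodically afterwards; a `False` result signals that the stored bits have changed and the file should be restored from another copy.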
Digital Preservation Group
Conducted Initial Digital
Storage Needs Estimate (2016)
•Conclusions: 10-12 TB/year for all access files needed
(Not permanent Digital Storage, requiring now 60-70 TB)
•University Archives:
•Thesis project: 500 GB per year
•Yearbook/Football negatives: 235 GB per year
•San Marcos Daily Record Negatives: 1500 GB per year
•Audio digitization: 500 GB per year
•Misc imaging: 500 GB per year
•Wittliff Collections:
•Unique digitization projects: Lonesome Dove Dailies (20 TB), Powers (10 TB), Broyles (300
GB), Jerry Jeff Walker 2# reel tapes.
•O’Connor Collection/New Major Donation example (2TB).
•Austin Film Festival: 1.5 TB per year, (2+ years).
•Misc imaging: 2 TB per year
•Audio digitization: Wittliff: 200 GB / year
•General Collections:
•Streaming media archive: 2 TB per year, General Collections (Covered by LOCKSS, PORTICO
Memberships)
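The itemized figures above can be tallied with a short script. This is a sketch under stated assumptions: 1 TB is treated as 1000 GB, the split between recurring and one-time items is my reading of the slide, and the Jerry Jeff Walker tapes are left out because no size is given. It sums only the listed items and does not attempt to reproduce the 10-12 TB/year conclusion.

```python
GB_PER_TB = 1000  # assumption: decimal units

# Items the slide states "per year" (sizes in GB).
recurring_gb_per_year = {
    "Archives: thesis project": 500,
    "Archives: yearbook/football negatives": 235,
    "Archives: San Marcos Daily Record negatives": 1500,
    "Archives: audio digitization": 500,
    "Archives: misc imaging": 500,
    "Wittliff: Austin Film Festival": 1500,
    "Wittliff: misc imaging": 2000,
    "Wittliff: audio digitization": 200,
    "General Collections: streaming media archive": 2000,
}

# Items the slide gives as a single project size (sizes in GB).
one_time_gb = {
    "Wittliff: Lonesome Dove Dailies": 20000,
    "Wittliff: Powers": 10000,
    "Wittliff: Broyles": 300,
    "Wittliff: O'Connor Collection": 2000,
}

recurring_total_gb = sum(recurring_gb_per_year.values())
one_time_total_gb = sum(one_time_gb.values())

print(f"Recurring: {recurring_total_gb / GB_PER_TB:.3f} TB per year")
print(f"One-time projects: {one_time_total_gb / GB_PER_TB:.1f} TB")
```

Under these assumptions the listed recurring items total 8,935 GB per year and the one-time projects total 32,300 GB.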
2016-2018 Texas Digital Library
Forms First State Digital Preservation
Resource Infrastructure
•2016 TDL Preservation Services Initiated
(Hires Courtney Mumma to Focus on
State Digital Preservation Services)
•2016 TDL Forms Alliance with DuraCloud
(Digital Preservation focused Non-Profit,
Duracloud@TDL)
•2017 TDL Creates Digital Preservation Services:
Members receive “Space” in DuraCloud@TDL for
ingesting content, based on membership level.
•2018 Texas-wide TDL Archivematica Users Group
Formed
2018-2019
Digital Preservation Working Group
Storage Recommendation Charge
Charge
Methodology
•Conduct Environmental
Scan to Identify Library Digital
Preservation Storage Options
•Compare Texas Peer Groups
(TDL) and National Best
Practices for Research Libraries
•Narrow the Focus to
pragmatic options suitable for
University Libraries’ Needs
•Forward Recommendation
for AVP and VPIT Review and
Approval
2019
Digital Preservation
Storage Focus
•Investigation begins into various Historic,
Library Centered, University and
Commercial Solutions
•Growing recognition of permanent digital
preservation storage needs
•Growing recognition that Resource
Possibilities are maturing and widely
available both commercially and in the
library space
•Possible solutions ranged from new to
previous models and from in-house to
outsourced possibilities
Environmental Scan
Digital Preservation Solutions (Peer Institutions)
Texas Peer Institutions and their Digital Preservation Solutions
•University of Texas at San Antonio: Duracloud directly (not via Texas Digital Library, TDL)
•University of Houston: Amazon S3 and Glacier directly (not via Texas Digital Library, TDL)
•UT Rio Grande Valley: Chronopolis via DuraCloud through TDL
•University of Texas (Austin): LTO Tape, moving to Texas Advanced Computing Center
•Texas A & M University: Chronopolis and Amazon via Duracloud@TDL
Three Final Candidates for Texas State
University Preservation Storage
Option 1: Outsource Preservation Digital Storage
•Preservica
Option 2: In-House Texas State Data Center Solution
•files.txstate.edu
Option 3: Duracloud through Texas Digital Library
Options
•Amazon S3
•Amazon Glacier
•Chronopolis
Option 1: Outsource
(All in One Outsource Option, Preservica)
Benefits
•Preservica creates AIPs (Archival Information Packages,
Metadata) and provides all technology set-up and support
•Established Archival Best Practices
•Recognized Library Peer and Community of Practice
Considerations
•Costs: $35,000.00/year for 20 TB
•No local control or entrance to underlying technology (black
box)
•Variable Response to Local Needs (similar considerations
to @mire)
Option 2: In House
Expand TR/Texas State Data Center Relationship
Benefits
•Proven relationship with TR.
•Storage for working files, access copies,
preservation files and associated metadata
•Building on our current temporary solution of
files.txstate.edu and increasing capacity. Growth
estimate of 10-12 TB/year
Considerations
•Does not meet requirements for geographic,
administrative and technological distribution
(even if multiple copies)
•Specialization not in place: Metadata
Infrastructure, Normalization of Various
Formats, library-related expertise or best
practices for this type of Digital Preservation
•30-day window for recovery is currently not
sufficient for maintaining preservation files
Option 3:
Duracloud through TDL
(Texas Digital Library)
to Chronopolis Option
Chronopolis: Geographically Distributed
Preservation Network
•UC San Diego
•National Center for Atmospheric Research
•University of Maryland, Institute for Advanced Computing
Studies
•Texas Digital Library/TACC
Benefits
•Geographic Distribution at any
3 technologically diverse
partner nodes
•Non-Commercial solution
rooted in libraries and cultural
heritage community
•Library community of practice
around this (TDL/Duracloud/Chronopolis)
•File Fixity and Data Integrity
processes are transparent
Considerations
•Subscription cost:
$2500 annual fee includes
2 TB/year storage and ingest
$1000 initial setup (1st year
only)
•Storage $165/year/additional TB
•$120 ingest fee/additional TB
•Significant Human resources/time
investment for initial technological
integration
Option 3: Duracloud Through
the Texas Digital Library (TDL)
•Duracloud is a hosted middleware service from
DuraSpace that lets organizations control where and
how digital content is preserved.
•The parent organization, Duraspace, is a non-profit
organization providing academic library leadership
for open source technologies focused upon durable,
persistent access to digital data (e.g., Fedora,
DSpace).
•Currently, Duraspace is part of Lyrasis, a
long-standing library-related organization supporting
libraries and technology initiatives
Option 3: Duracloud Through
the Texas Digital Library (TDL)
•Duracloud would be administered through our TDL
membership, with its consortial relationships,
advantages (user groups, networks etc.) and constraints
•The Texas Digital Library is a consortial organization
consisting of 22 Texas university library organizations
•TDL focuses on enabling Texas libraries’ digital
infrastructure and new digital technology projects.
Option 3: Duracloud through TDL
Duracloud through TDL/Amazon S3 and Glacier Option
Benefits
•TDL possesses established community of
practice. Part of Duracloud Suite
•S3 suitable for streaming, dynamic
access or Glacier for long-term dark
archive needs
Considerations
•Commercial: not tailored to cultural heritage
institutions. Does not meet requirements for
geographic, administrative and technological
distribution
•File fixity and data integrity is a black box (process
hidden from owners)
•HR/Time Investment for Initial Technological
Integration
Subscription cost
•$2500 annual fee includes 2 TB/year
•$1000 initial setup (1st year only)
•S3 $265/year per additional TB
•Glacier $50/year per additional TB
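The storage fees quoted on the option slides can be compared at a common volume. This is a rough sketch, not a quote: it assumes the $2500 annual fee covers the first 2 TB, that additional terabytes are charged linearly at the listed rates, and it leaves out setup fees, ingest fees and staff time. The function name `annual_storage_cost` is hypothetical.

```python
BASE_ANNUAL_FEE = 2500   # TDL annual fee quoted on the slides (USD)
INCLUDED_TB = 2          # storage included in the annual fee

# Listed yearly rate per additional TB for each DuraCloud@TDL back end (USD).
RATE_PER_EXTRA_TB = {
    "Chronopolis": 165,
    "Amazon S3": 265,
    "Amazon Glacier": 50,
}

# Preservica is quoted as a flat yearly price for 20 TB (USD).
PRESERVICA_20_TB = 35000


def annual_storage_cost(option, stored_tb):
    """Yearly storage cost for a DuraCloud@TDL option, assuming linear
    pricing beyond the included allowance (an assumption of this sketch)."""
    extra_tb = max(0, stored_tb - INCLUDED_TB)
    return BASE_ANNUAL_FEE + extra_tb * RATE_PER_EXTRA_TB[option]


for option in RATE_PER_EXTRA_TB:
    print(f"{option}: ${annual_storage_cost(option, 20):,} per year at 20 TB")
print(f"Preservica: ${PRESERVICA_20_TB:,} per year at 20 TB")
```

Under these assumptions, 20 TB comes to $5,470 per year with Chronopolis, $7,270 with Amazon S3 and $3,400 with Amazon Glacier, against the quoted $35,000 for Preservica.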
Digital Preservation
Storage Working Group
Final Recommendation
Chronopolis via DuraCloud
through TDL (Texas Digital Library)
•Provides strong library support through four
academic-library-focused organizations
(Chronopolis, Duraspace, TDL, Lyrasis) for long-term
viability and peer support networks
•Anticipated Budgetary Request:
•Year 1: $3500.00 ($2500.00 TDL
Preservation/year, $1000.00 Initial
Set-up/Onboarding, Includes 2 TB Storage)
•Year 2-3: $2785.00/year (includes
additional 1 TB storage/year)
•Review Storage and Staff Needs Annually.
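The budget figures above can be reproduced from the Chronopolis fee schedule. This is a minimal sketch: the function name `yearly_request` and the way storage and ingest are counted separately are assumptions. The slides do not state how a year with more than one accumulated additional terabyte would be billed, so the sketch only reproduces the two quoted amounts.

```python
ANNUAL_FEE = 2500            # TDL preservation fee per year (USD)
SETUP_FEE = 1000             # initial set-up/onboarding, first year only
STORAGE_PER_EXTRA_TB = 165   # yearly storage fee per additional TB
INGEST_PER_EXTRA_TB = 120    # ingest fee per additional TB


def yearly_request(first_year, extra_tb_stored, extra_tb_ingested):
    """Budget request for one year under the listed Chronopolis fees.

    `extra_tb_stored` and `extra_tb_ingested` count terabytes beyond
    the 2 TB included in the annual fee (hypothetical parameters).
    """
    cost = ANNUAL_FEE
    cost += extra_tb_stored * STORAGE_PER_EXTRA_TB
    cost += extra_tb_ingested * INGEST_PER_EXTRA_TB
    if first_year:
        cost += SETUP_FEE
    return cost


year_1 = yearly_request(first_year=True, extra_tb_stored=0, extra_tb_ingested=0)
year_2 = yearly_request(first_year=False, extra_tb_stored=1, extra_tb_ingested=1)
print(year_1, year_2)
```

With no additional storage, Year 1 comes to $3500; with one additional terabyte stored and ingested, a later year comes to $2785, matching the quoted request.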
Deeper Rationale
For Long Term
Digital Preservation
Storage Infrastructure
•New Level of Service Expected by Donors,
Researchers, Faculty and students.
•Present Area of Focus for Research Libraries
•Connects the Library with many State and National
Library Technology Organizations focused on these
Issues: TDL (Texas Digital Library), CNI (Coalition for
Networked Information), JISC, LITA (Library and Information
Technology Association), Chronopolis, Duraspace
•Places Texas State Libraries in line with institutions
we have joined and are aspiring towards: GWLA
(Greater Western Library Alliance) and ARL
(Association of Research Libraries)
Questions?