OhioLINK is building a “trusted digital repository” on behalf of its membership. As we build it, we want to understand what “trusted” means, so we are conducting an audit to assess whether we can claim to be trustworthy. The process is shaping up to have four major phases:
- Research common and best practices for preservation.
- Evaluate the OhioLINK policies and processes against common and best practices.
- Perform a gap analysis between where we are now and where common and best practices suggest we should be.
- Propose and adopt policies and processes that get us closer to the ideal common and best practices.
This is a report at the end of phase 1. Earlier this year, two major reports were released that address how one measures a “trustworthy repository.” The two reports are summarized below, followed by a recommendation.
Trustworthy Repositories Audit & Certification: Criteria and Checklist
The first is the OCLC/CRL/NARA Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC:CC). This document “represents best current practice and thought about the organization and technical infrastructure required to be considered trustworthy and capable of certification.” Quoting again:
The nestor working group says a trusted, “long-term digital repository is a complex and interrelated system” (nestor 2006). However, more than just the “digital preservation system” drives the management of the digital materials. In determining trustworthiness, one must look at the entire system in which the digital information is managed, including the organization running the repository: its governance; organizational structure and staffing; policies and procedures; financial fitness and sustainability; the contracts, licenses, and liabilities under which it must operate; and trusted inheritors of data, as applicable. Additionally, the digital object management practices, technological infrastructure, and data security in place must be reasonable and adequate to fulfill the mission and commitments of the repository.
A trusted digital repository will understand threats to and risks within its systems. As articulated by Rosenthal et al. (2005), these potential threats include media failure, hardware failure, software failure, communication errors, failure of network services, media and hardware obsolescence, software obsolescence, operator error, natural disaster, external attack, internal attack, economic failure, and organizational failure. Constant monitoring, planning, and maintenance, as well as conscious actions and strategy implementation will be required of repositories to carry out their mission of digital preservation. All of these present an expensive, complex undertaking that depositors, stakeholders, funders, the designated community(ies), and other digital repositories will need to rely on in the greater collaborative digital preservation environment that is required to preserve the vast amounts of digital information generated now and into the future.
The TRAC:CC contains 84 criteria broken out into three main sections: Organizational infrastructure; Digital object management; and Technologies, technical infrastructure, and security. Within each of these sections are various subsections and under the subsections are the criteria themselves.
- Organizational infrastructure
  - Governance & organizational viability
  - Organizational structure & staffing
  - Procedural accountability & policy framework
  - Financial sustainability
  - Contracts, licenses, & liabilities
- Digital object management
  - Ingest: acquisition of content
  - Ingest: creation of the archivable package
  - Preservation planning
  - Archival storage & preservation/maintenance of AIPs
  - Information management
  - Access management
- Technologies, technical infrastructure, and security
  - System infrastructure
  - Appropriate technologies
  - Security
Some sample criteria are:
- A1.1 Repository has a mission statement that reflects a commitment to the long-term retention of, management of, and access to digital information.
- A2.2 Repository has the appropriate number of staff to support all functions and services.
- A3.5 Repository has policies and procedures to ensure that feedback from producers and users is sought and addressed over time.
- A5.4 Repository tracks and manages intellectual property rights and restrictions on use of repository content as required by deposit agreement, contract, or license.
- B2.5 Repository has and uses a naming convention that generates visible, persistent, unique identifiers for all archived objects (i.e., AIPs).
- B2.9 Repository acquires preservation metadata (i.e., PDI) for its associated Content Information.
- B3.4 Repository can provide evidence of the effectiveness of its preservation planning.
- B4.4 Repository actively monitors integrity of archival objects (i.e., AIPs).
- B5.3 Repository can demonstrate that referential integrity is created between all archived objects (i.e., AIPs) and associated descriptive information.
- C1.1 Repository functions on well-supported operating systems and other core infrastructural software.
- C1.5 Repository has effective mechanisms to detect bit corruption or loss.
- C1.7 Repository has defined processes for storage media and/or hardware change (e.g., refreshing, migration).
- C1.9 Repository has a process for testing the effect of critical changes to the system.
- C3.3 Repository staff have delineated roles, responsibilities, and authorizations related to implementing changes within the system.
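Criteria such as B4.4 and C1.5 ask for active integrity monitoring and a mechanism to detect bit corruption or loss. TRAC:CC does not prescribe how to do this; one common approach is a periodic fixity check against stored checksums. The sketch below is only an illustration under stated assumptions: the manifest format (relative path mapped to a SHA-256 hex digest) and the function names `sha256_of` and `audit_fixity` are hypothetical, not part of TRAC:CC or of any OhioLINK system.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_fixity(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths that are missing or no longer match.

    `manifest` is a hypothetical mapping of relative path -> expected
    SHA-256 digest, recorded when the object was ingested.
    """
    failures = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures
```

A repository could run such a check on a schedule and keep the results as part of the documentation (evidence) that the criteria call for.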
TRAC:CC states that the “criteria are written to be applicable to any kind of digital repository or archives.” As such, the criteria should be placed within the context of the vision and goals of the Ohio DRC. One demonstrates compliance with the criteria through documentation (evidence), transparency (open examination of the evidence), adequacy (degree to which the evidence meets the vision/goals), and measurability.
Digital Repository Audit Method Based On Risk Assessment
The second major report is the Digital Repository Audit Method Based On Risk Assessment (DRAMBORA). It was written by the Digital Curation Centre (a U.K. JISC-funded effort researching best practice for storage management and preservation of digital information) and Digital Preservation Europe (a European Commission-funded project to improve coordination and cooperation among member states for digital preservation). DRAMBORA is a more methodical approach to assessing the trustworthiness of a repository. A systematic process guides the auditor to identify risks to the long-term preservation of repository content, and then to score each risk as the product of the likelihood of the risk occurring and the impact associated with that event. Mitigation of the risks can then be prioritized in descending order of score.
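The scoring and ranking step can be sketched in a few lines. This is a minimal illustration, not DRAMBORA's own worksheet: the `Risk` class, the `prioritize` function, and the numeric values are assumptions made up for the example (the risk names are borrowed from the threat list quoted earlier), and the scales DRAMBORA actually uses should be taken from the report itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # hypothetical scale; higher means more likely
    impact: int      # hypothetical scale; higher means more damaging

    @property
    def score(self) -> int:
        # Risk score is the product of likelihood and impact.
        return self.likelihood * self.impact


def prioritize(risks: list[Risk]) -> list[Risk]:
    """Order risks for mitigation, highest score first."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Illustrative values only; a real audit would assign these per risk.
risks = [
    Risk("Media failure", likelihood=4, impact=5),
    Risk("Operator error", likelihood=3, impact=3),
    Risk("Economic failure", likelihood=2, impact=6),
]

for risk in prioritize(risks):
    print(f"{risk.score:>3}  {risk.name}")
```

With these made-up numbers, media failure (score 20) would be addressed first, then economic failure (12), then operator error (9).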
The process has six stages, some with multiple tasks:
- Identify organizational context
  - Specify mandate of your repository or the organization in which it is embedded
  - List goals and objectives of your repository
- Document policy and regulatory framework
  - List your repository’s strategic planning documents
  - List the legal, regulatory, and contractual frameworks or agreements to which your repository is subject
  - List the voluntary codes to which your repository has agreed to adhere
  - List any other documents and principles with which your repository complies
- Identify activities, assets and their owners
  - Identify your repository’s activities, assets and their owners
- Identify risks
  - Identify risks associated with activities and assets of your repository
- Assess risks
  - Assess the identified risks
- Manage risks
  - Manage the risks
The report includes a catalog of risks taken from other checklists and repository audits that can be used to spur the thinking of the auditor.
As other reviewers of these documents have noted, DRAMBORA takes a more quantified approach to assessing repositories. As such, I think it would work best for the self-review of an established repository. TRAC:CC is more open-ended and exploratory, taking into account the vision, goals, and plans for a repository. The authors of DRAMBORA estimate that the audit takes 28 to 40 hours to complete; TRAC:CC does not provide an estimate, but I think its more general nature means that it would take less time.