Why Fedora? Because You Don’t Need Fedora

I’m often asked “Why is OhioLINK using FEDORA?” (Just to eliminate any confusion at the start, I’m referring to the FEDORA Digital Object Repository, a project of Cornell’s computer science department and the University of Virginia Libraries, and not the Linux operating system distribution by Red Hat.) There are many reasons, but I was reminded of one recently while reading through the migration documentation for the 2.1.1 release that came out today.

In case of corruption or failure of the repository, the Fedora Rebuild utility can completely rebuild the repository by crawling the digital object XML source files that are stored on disk.

The fedora-rebuild command launches the interactive Fedora Rebuild utility that restores the repository if it somehow became corrupted.   Symptoms of repository corruption include underlying indexes or registries becoming unusable, or the server refusing to start.   The components of the repository that can become corrupted are the SQL relational database and the RDF triplestore that underlie the Fedora repository service.   The SQL database (e.g., MySQL, McKoi, or Oracle) contains a set of registries, as well as metadata to enable simple searching of the repository,  and a cache of digital object profiles to speed up API-A access to the repository.  The triplestore contains RDF triples for key properties of digital objects, datastreams, disseminations, and relationships to create an RDF-based index of the repository (used for more advanced RDF-based searching).

http://www.fedora.info/download/2.1.1/userdocs/server/cmd-line/index.html#rebuild

Translation? If your Fedora system blows up (software glitch, disk failure, or heaven forbid a logic bomb), you can restore the entire thing by simply reading files off the disk. Yep, that’s right. That large and complex software package only optimizes access to the objects. The digital objects themselves are stored in a METS-like package called Fedora Object XML (FOXML). All of the metadata (descriptive, preservation, and relationship to other objects) and managed datastreams that make up a digital object are “serialized” to a single XML file on a file system. Back up those XML files, and you’ve just created a preservation copy of your entire system. Don’t like Fedora, or in five years something better comes along? Just program the new system to read these METS-like packages, load them in, and away you go. And, as the above quote from the documentation suggests, if something bad happens to the repository software or database, don’t fret: your worst-case scenario is to wipe the system clean, restore the FOXML packages from backup, run the rebuild script, and away you go.
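To make the “just read the files off the disk” idea concrete, here is a minimal Python sketch of what a replacement system (or a homegrown rebuild step) might do: walk a directory of FOXML files and recover each object’s identifier and its list of datastreams. This is not the Fedora Rebuild utility, only an illustration. The namespace URI and the `PID` and `ID` attribute names reflect my understanding of FOXML and should be checked against your own files; the function names, the sample object, and the directory layout are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Assumption: the FOXML namespace URI as I understand it; verify against your files.
FOXML_NS = "info:fedora/fedora-system:def/foxml#"

# Hypothetical, heavily trimmed FOXML object used only for the demo below.
SAMPLE_FOXML = """<foxml:digitalObject PID="demo:1"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#">
  <foxml:datastream ID="DC" STATE="A" CONTROL_GROUP="X"/>
  <foxml:datastream ID="IMAGE" STATE="A" CONTROL_GROUP="M"/>
</foxml:digitalObject>"""


def summarize_foxml(path):
    """Return (pid, [datastream IDs]) for one FOXML file."""
    root = ET.parse(path).getroot()
    # Assumption: the object identifier is the PID attribute on the root element.
    pid = root.get("PID")
    # Collect datastream IDs in document order.
    datastreams = [ds.get("ID") for ds in root.iter("{%s}datastream" % FOXML_NS)]
    return pid, datastreams


def rebuild_index(object_dir):
    """Walk object_dir and map each object's PID to its datastream IDs."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(object_dir):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                pid, datastreams = summarize_foxml(path)
            except ET.ParseError:
                continue  # skip anything that is not well-formed XML
            if pid is not None:
                index[pid] = datastreams
    return index


if __name__ == "__main__":
    # Demo: write the sample object to a temporary directory and index it.
    with tempfile.TemporaryDirectory() as object_dir:
        with open(os.path.join(object_dir, "demo_1.xml"), "w", encoding="utf-8") as f:
            f.write(SAMPLE_FOXML)
        print(rebuild_index(object_dir))
```

A real migration would go on to read the metadata and content inside each datastream, but the point stands: nothing here talks to the Fedora server or its database.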

From a preservation perspective, what could be better for the long-term health of your content than a digital object repository system that doesn’t require that you ever use that system at all?

The text was modified to update a link from http://comm.nsdl.org/pipermail/fedora-users/2006-April/001505.html to http://sourceforge.net/mailarchive/forum.php?thread_name=443A7F13.2050605%40virginia.edu&forum_name=fedora-commons-users.

(This post was updated on 27-Oct-2010.)