A Vision for FEDORA’s Future, an Implementation Plan to Get There, and a Project Update


This article was imported from this blog's previous content management system (WordPress), and may have errors in formatting and functionality. If you find these errors are a significant barrier to understanding the article, please let me know.

This morning, Sandy Payette of Cornell University, co-director of the FEDORA project, gave an update on the project that included a statement of a vision for FEDORA's future, information about the emerging FEDORA Commons non-profit, and a status report/roadmap for the software itself. Below is a summary based on my notes of Sandy's comments and slide content.

Vision for FEDORA's Future

From her perspective, Sandy sees many kinds of projects using FEDORA, and she sees them fall into these general categories: Scholarly Workbenches — capturing, managing and publishing the process of scholarship; Linking Data and Publications — complex objects built up of relationships with different types of internal and external objects; Reviews and Annotations of Objects — blogs and wikis on top of information spaces; collaborations surrounding a repository object; and Museum Exhibits with K-12 Lesson Plans.

Based on these observations, she envisions FEDORA evolving into open source software for building robust information spaces with these major components:

  • repository services: management, access, versioning, and storage of digital objects
  • preservation services: repository integrity checking, monitoring, alerting, migration, and replication
  • process management services: workflow centered around digital objects and messaging with peer applications
  • collaboration services: annotation, discussion, and rating of digital objects

The collaboration services suite has not been part of the core FEDORA project to date. Other people have found clever ways to put services such as blogs and wikis on top of a FEDORA content repository, but there are functions that can be put into the FEDORA system that can enable and enhance collaborative services.

FEDORA, of course, does not exist in isolation from other activities on the internet, and what is commonly called "Web 2.0" has implications for the FEDORA system. The key theme of Web 2.0 is an "architecture of participation": the capability to remix and transform data sources (building on top of objects that already exist) to harness the collective intelligence of the community. Some specific examples are collaborative classification (Del.Icio.Us), content sharing and feedback (YouTube), the power of collective intelligence (Wikipedia, Amazon reviews), and alternative trust models (such as eBay, which is based on reputation). This emergent behavior is influencing upcoming generations of scholars and scientists; they will have completely different expectations regarding the technology they use for learning and research.

Taken as a whole, the vision for FEDORA is to enable "object-centric" collaborations. FEDORA is evolving into an open source platform that integrates a robust core (repositories and enterprise SOA) with dynamic content access (collaborative applications and web access/re-use). It is a technology for complex digital objects. As contrasted with a technology such as Wikipedia's MediaWiki — ideal for working with wiki-based resources — FEDORA is great for many different applications, including as a content store for wikis. In other words, one is not tied to one particular application or use case.

Fedora Commons Non-Profit

FEDORA as a project is evolving into FEDORA as an organization. That organization, called Fedora Commons, will be a non-profit "to enable software developers to collaborate and create open source software (repository, services, tools) that enables information-oriented communities to collaborate in creating new forms of network-based knowledge objects (educational, scholarly, cultural) and that, ultimately, enables institutions to manage and preserve these information objects over time in a robust repository and service architecture." Fedora Commons will be a custodian of the software platform and the means to steer its direction.

Structurally, it is envisioned as a 501c3 (as in the section of the IRS tax code) non-profit charitable organization. A proposal to the Moore Foundation is being prepared, seeking a grant for the initial start-up funds for the Fedora Commons with a focus on sustainability and community building. The Commons may also seek matching funds from other foundations (Getty, Mellon) in later years until the organization is fully self-sustaining. The current thinking is that the Commons will achieve "steady state" with its own business model in 2010. The startup funds will extend the funding for the core development team as well as foster a community of contributors to the project and committers to the code base. The plans include several funded positions: board of directors, executive director, technology architect (supervising sysadmin and build master as well as developers), a financial/accounting specialist, and a communications specialist.

Sustainability in this context means increasing the installed base of FEDORA as well as moving towards a community leadership model. One model is the Eclipse Foundation, with four technical councils (collaboration, repository, enterprise, preservation) and corresponding community outreach councils. The community will also need to develop an income-generating model, be it corporate membership (a dues structure like Eclipse) and/or university and government members.

Fedora Project Status Report

Fedora 2.2 was released on January 19th, and Sandy went through the major changes and features. First is FEDORA as a web application; it has been refactored and repackaged so it can now run in different (even existing) servlet/web containers. Along with this is a new installer application that steps one through the process of bringing up the software. There is a "Quick" option to get running immediately and a "Custom" option to set Fedora up optimally for a particular environment.

Within FEDORA itself, datastreams can now have checksums, and this is supported with new repository configuration options. This enables trusted client/server collaboration and offers on-demand integrity checking of the repository. The manner in which FEDORA handles authentication has changed as well; version 2.2 uses servlet filters instead of Tomcat realms. This decouples FEDORA authentication from Tomcat. Three filters come with the core software: username/password file, LDAP, and Pubcookie.
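To make the checksum idea concrete, here is a minimal sketch of what on-demand integrity checking amounts to in general: store a digest alongside the content, then recompute and compare it later. This is not FEDORA's API or implementation; the function names (`compute_checksum`, `verify_datastream`), the sample content, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def compute_checksum(content: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of some datastream content (hypothetical helper)."""
    return hashlib.new(algorithm, content).hexdigest()


def verify_datastream(content: bytes, stored_checksum: str,
                      algorithm: str = "sha256") -> bool:
    """Recompute the digest and compare it to the one recorded at ingest."""
    return compute_checksum(content, algorithm) == stored_checksum.lower()


# Illustrative content only; a real datastream would come from the repository.
original = b"<dc:title>Example object</dc:title>"
stored = compute_checksum(original)

print(verify_datastream(original, stored))          # content unchanged
print(verify_datastream(original + b" ", stored))   # content altered
```

The same comparison works in both directions: a client can confirm that the repository received what was sent, and the repository can re-check stored content at any later time.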

FEDORA 2.2 also includes several modules from community committers: GSearch (configurable search for datastreams in FEDORA); Journaling (replication/recovery module for repositories); and MPTStore (new high-performing triplestore).

Sandy also covered the roadmap. The Mellon Phase 2 grant runs through 4Q2007, and the remaining work covers content models, content model dissemination architecture, a basic messaging service, and preservation services. Next is "FEDORA Enterprise" (in the form of a grant proposal now in front of Mellon, ending in 2Q2009), which is to include a workflow engine and supporting tools, message-oriented middleware for an enterprise service bus (ESB), and distributed transactions. Finally, the FEDORA Commons 501c3 work (starting 3Q2007) comes in two parts: the technical (evolution of the integrated platform) and community building (fostering development and outreach, evolving a business model, and tapping ongoing sources of funding).

[Updated 20070129T1655 to correct the section of the U.S. Tax Code in the last paragraph. I don't think we want anything to do with 26 USC 301c3.]