XTF and FEDORA — Comments from the Community

Some questions and observations on the analysis of the XTF/FEDORA integration have come in through mechanisms other than blog comments. I’ve reproduced those here for the sake of completeness, but be sure to go back to the first two entries in this series to read the comments there as well.

Indiana University’s Observations


As it turns out, Indiana University is considering much the same path. They have an existing FEDORA-based repository and a number of XTF projects that have been in development for a while. They, too, are looking to put these two technologies together and have a page on their project website with IU’s observations of an XTF plus FEDORA (plus more!) combination.

Modifications to FreePress Recent Comments Plugin

For others who may find it useful, I’ve made two modifications to the FreePress Recent Comments plugin on DLTJ: one to strip out quoted material when using the Quoter plugin and a second to suppress pingback entries that result from links to material within the blog.

Code for the first came from a blog posting about how to get Quoter to work with a different recent comments plugin. It is slightly modified, though, to use a non-greedy wildcard (*?) in the middle. (It is possible to have more than one quoted section in a comment, and the original code would leave just the text beyond the final [/quote] tag.) The context-sensitive diff is:
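Separately from that diff, the greedy versus non-greedy difference can be illustrated with a minimal Python sketch. This is not the plugin's code (the plugin is PHP); the pattern, the optional attributes on the opening tag, and the `strip_quotes` helper are assumptions made up for the illustration.

```python
import re

# Assumed tag shape: [quote ...] ... [/quote]. The optional attributes on the
# opening tag are a guess; only the [/quote] closing tag is named in the post.
GREEDY = re.compile(r"\[quote[^\]]*\].*\[/quote\]", re.DOTALL)
NON_GREEDY = re.compile(r"\[quote[^\]]*\].*?\[/quote\]", re.DOTALL)


def strip_quotes(comment, pattern=NON_GREEDY):
    """Remove quoted sections (hypothetical helper) and collapse whitespace."""
    return " ".join(pattern.sub(" ", comment).split())


comment = "[quote]first[/quote] Reply one. [quote]second[/quote] Reply two."

# The non-greedy pattern removes each quoted section separately.
print(strip_quotes(comment))
# The greedy pattern swallows everything from the first [quote] to the last
# [/quote], so the reply between the two quoted sections is lost.
print(strip_quotes(comment, GREEDY))
```

With the greedy form, only the text after the final closing tag survives, which matches the behavior described above for comments with more than one quoted section.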

Analysis of CDL’s XTF textIndexer to Replace the Local Files with FEDORA Objects

This is a continuation of the investigation into integrating the California Digital Library’s XTF software with the FEDORA digital object repository that started earlier. This analysis looks at the textIndexer module in particular, starting with an overview of how textIndexer works now with filesystem-based objects and ending with an outline of how it could work with objects read from a FEDORA repository instead.

XTF’s Native File System handler

Natively, XTF wants to read content out of the file system. The core of the processing is done in these two class files:

TextIndexer.java

CDL’s XTF as a Front End to Fedora

We’re experimenting pretty heavily now with the California Digital Library’s XTF framework as a front-end to a FEDORA object repository. Initial efforts look promising. Thanks go out to Brian Tingle and Kirk Hastings of CDL; Jeff Cousens, Steve DiDomenico, and Bill Parod from Northwestern; and Ross Wayland from UVa for helping us along in the right direction.

XTF into Eclipse How-To


As we get more serious about XTF, I wrote up a How-To document for bringing XTF into Eclipse so that it can be deployed as a dynamic web application. Let me know if you find it useful. Definitely let me know if you find it in error. We haven’t put a version of XTF into OhioLINK’s source code repository, but that might follow shortly.

A Known Citation Discovery Tool in a Library2.0 World

When it comes to seeking a full-text copy of that known-item citation, are our users asking “what have you done for me lately?” OpenURL has taken us pretty far when one starts in an online environment (a link that sends the citation elements to our favorite link resolver), but it only works when the user starts online with an OpenURL-enabled database. (We also need to set aside for the moment the need for some sort of OpenURL base resolver URL discovery tool: how does an arbitrary service know which OpenURL base resolver I want to use?) What if a user has a citation on a printed paper or from some other non-online form? Could we make their lives easier, too? Here is one way. (Thanks go out to Celeste Feather and Thomas Dowling for helping me think through the possibilities and issues.)
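To make "a link that sends the citation elements to our favorite link resolver" concrete, here is a rough Python sketch of building an OpenURL 1.0 (Z39.88-2004) key/encoded-value link for a journal article. The resolver base URL, the citation values, and the `build_openurl` helper are all hypothetical; a real resolver may expect additional keys.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_openurl(base_url, citation):
    """Hypothetical helper: turn citation elements into an OpenURL link."""
    params = {
        "url_ver": "Z39.88-2004",
        # Metadata format identifier for journal citations.
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Citation elements are carried as "rft." (referent) keys.
    params.update({"rft." + key: value for key, value in citation.items()})
    return base_url + "?" + urlencode(params)


# Made-up citation and resolver address, for illustration only.
citation = {
    "atitle": "An Example Article",
    "jtitle": "Example Journal",
    "volume": "12",
    "spage": "34",
    "date": "2006",
}
url = build_openurl("https://resolver.example.edu/openurl", citation)
print(url)
```

The open question raised above still applies: something has to supply the base resolver address, which is exactly what a user holding a printed citation lacks.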

DLTJ under a New Theme

Okay, let’s try this again. The first time around didn’t go so well, so I went back to the basics and started with a new theme: what is running on DLTJ now is a modestly modified version of Barthelme version 1.2.2 by Scott Allan Wallick. The modifications include the insertion of the Extended Live Archives plugin on the front page. I’m pretty excited about this; I think it better answers the question of why someone would want to come to the home page of a blog: not to read a reverse chronological list of the author’s thoughts, but to use a mechanism for drilling down to what they might be looking for (whether by category, by tag, or chronologically). Take a look at the home page and let me know what you think.

Integration announced for DPubS (e-journal publishing system) and FEDORA (digital object repository)

The August 2006 edition of “The DPubS Report” produced by Cornell University Libraries for the DPubS community announced work underway at Penn State to bridge the worlds of DPubS and FEDORA. Here is the line from the newsletter:

--------------------------------------------------------------------------
SOFTWARE DEVELOPMENT UPDATE
--------------------------------------------------------------------------
[...]
NEAR-TERM SCHEDULED WORK
[...]
* Penn State is working on Fedora interoperability. The plan is to
have that capability in the September release, with a working version
for testing in late August.

The newsletter goes on to say that the work will be made available under an open source license, so I for one can’t wait to see what it looks like and how we might apply it to our own needs.

DLTJ page rendering updated (sorry Bloglines users)

Overnight I made several changes to the layout and rendering of pages on DLTJ, both in its web presentation and in its RSS presentation. The changes were really driven by the fact that my tags were not getting picked up in Technorati because the WordPress UltimateTagWarrior plugin was not including them as expected in the RSS feed (although it was in the web presentation, which was throwing me off). And I’ve heard from some (hi, Karen!) that when I make changes to the RSS rendering, Bloglines makes it look like all of the posts have been updated. This isn’t the case; just the rendering of them has changed. You’d think that Bloglines would read the <pubDate> tag and figure this out for itself, but it doesn’t. So to all of the Bloglines users: you can ignore all of the “new” posts from DLTJ except for this one, because the content has not changed in those earlier posts.

Just In Time Acquisitions versus Just In Case Acquisitions

What if a service existed where patrons selected an item they needed out of our library catalog and that item was delivered to the patron even when the library did not yet own the item? Would that be useful? With the growth of online bookstores, our users do have the expectation of finding something they need on the web, clicking a few buttons and having it delivered. When such expectations of what is possible exist, where is the first place a patron would go to find recently published items: the online bookstore or their local library catalog? Does your gut tell you it is the online bookstore? Would it be desirable if the patron’s instinct were to be the local library catalog?