Google News Archive Search — Where Are the Links to Content from Libraries?

Extra! Extra! Read All About It! “Explore History as it Happened: Google News Now Has Archive Search” Extra! Extra!

In my imagination I can see and hear the newspaper carrier on the street corner barking out this call. Except, Kids These Days would probably decry the use of dead trees to carry stale news and already be reading it on their PDAs and text-messaging each other on their cell phones. As it is, I found out about it through a story on Search Engine Watch (also found in the Wall Street Journal, the U.K. Guardian, and the New York Times) which itself touted Google’s “200 Year News Archive Search.” It is a nice service; I look at it, though, and have to wonder about the changing, if not outright diminishing, role of libraries as couriers of information. After all, couldn’t links to resources from the user’s local library be included right there next to the commercial article suppliers? If they could, why aren’t they? And what does it mean that they are not?

Script for Testing HTTP Referer Headers

I’ve just had the third occasion where, while supporting a user, I suspected that the user had a piece of software blocking or modifying the HTTP “Referer” header (the HTTP specification’s spelling of “referrer”) that normally accompanies most interactions between a web browser and a web server. Rather than asking that user to run a complicated test I found elsewhere on the web, I whipped up a little ditty that tests for this with (hopefully) non-technical words and advice. At the bottom of this post is the source code for the script; feel free to take it and modify it for your own circumstances.
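As a rough illustration of the kind of check such a script performs, here is a minimal Python sketch. It is not the script from the post: the function names, the expected host, and the advice messages are all assumptions made for the example. It classifies the Referer value a server received as missing, intact, or altered.

```python
from urllib.parse import urlsplit


def diagnose_referer(referer, expected_host):
    """Classify the Referer header value a server received.

    referer: raw header value, or None if the header was absent.
    expected_host: host of the page that links to the test (hypothetical).
    """
    if referer is None or referer.strip() == "":
        # Header was stripped, e.g. by a browser setting, proxy, or
        # personal firewall / privacy software.
        return "missing"
    try:
        host = urlsplit(referer.strip()).hostname
    except ValueError:
        host = None
    if host is not None and host.lower() == expected_host.lower():
        return "ok"
    # Header is present but does not point at the linking page.
    return "modified"


# Plain-language advice for each outcome (wording is illustrative only).
MESSAGES = {
    "ok": "Your browser sent the expected information. No changes needed.",
    "missing": "Something on your computer is removing information that "
               "your browser normally sends. Check any privacy or firewall "
               "software you have installed.",
    "modified": "Something on your computer is changing information that "
                "your browser normally sends. Check any privacy or firewall "
                "software you have installed.",
}


def advice(referer, expected_host):
    """Return a non-technical message for the user."""
    return MESSAGES[diagnose_referer(referer, expected_host)]
```

In a real deployment the `referer` argument would come from the request headers supplied by the web server or framework; that wiring is left out here because it depends on the hosting environment.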

XTF and FEDORA — Comments from the Community

Some questions and observations on the analysis of the XTF/FEDORA integration have come in through mechanisms other than blog comments. I’ve reproduced those here for the sake of completeness, but also be sure to go back to the first two entries in this series to read the comments there as well.

Indiana University’s Observations


As it turns out, Indiana University is considering much the same path. They have an existing FEDORA-based repository and a number of XTF projects that have been in development for a while. They, too, are looking to put these two technologies together and have a page on their project website with IU’s observations of an XTF plus FEDORA (plus more!) combination.

Modifications to FreePress Recent Comments Plugin

For others who may find it useful, I’ve made two modifications to the FreePress Recent Comments plugin on DLTJ: one to strip out quoted material when using the Quoter plugin and a second to suppress pingback entries that result from links to material within the blog.

Code for the first came from a blog posting about how to get Quoter to work with a different recent comments plugin. It is slightly modified, though, with the use of a non-greedy wildcard (*?) in the middle. (It is possible to have more than one quoted section in a comment, and the original code would leave just the text beyond the final [/quote] tag.) The context-sensitive diff is:
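The diff itself does not appear in this excerpt. To illustrate only the greedy versus non-greedy difference described above, here is a small Python sketch. It is not the plugin’s code; the function names and the exact pattern are assumptions, and the only detail taken from the post is the [/quote] closing tag.

```python
import re

# Greedy: ".*" runs from the first [quote] to the LAST [/quote],
# swallowing any un-quoted text that sits between two quoted sections.
GREEDY = re.compile(r"\[quote\].*\[/quote\]", re.DOTALL)

# Non-greedy: ".*?" stops at the NEAREST [/quote], so each quoted
# section is removed on its own.
NON_GREEDY = re.compile(r"\[quote\].*?\[/quote\]", re.DOTALL)


def strip_quotes_greedy(comment):
    """Remove quoted material using the greedy pattern."""
    return GREEDY.sub("", comment)


def strip_quotes_non_greedy(comment):
    """Remove quoted material using the non-greedy pattern."""
    return NON_GREEDY.sub("", comment)


sample = "[quote]a[/quote]first [quote]b[/quote]second"
print(strip_quotes_greedy(sample))      # second
print(strip_quotes_non_greedy(sample))  # first second
```

With the greedy pattern, the reply text between the two quoted sections is lost; the non-greedy version keeps it.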

Analysis of CDL’s XTF textIndexer to Replace the Local Files with FEDORA Objects

This is a continuation of the investigation, started earlier, into integrating the California Digital Library’s XTF software with the FEDORA digital object repository. This analysis looks at the textIndexer module in particular, starting with an overview of how textIndexer works now with filesystem-based objects and ending with an outline of how it could work by reading objects from a FEDORA repository instead.

XTF’s Native File System handler

Natively, XTF wants to read content out of the file system. The core of the processing is done in these two class files:

TextIndexer.java

CDL’s XTF as a Front End to Fedora

We’re experimenting pretty heavily now with the California Digital Library’s XTF framework as a front-end to a FEDORA object repository. Initial efforts look promising. Thanks go out to Brian Tingle and Kirk Hastings of CDL; Jeff Cousens, Steve DiDomenico, and Bill Parod from Northwestern; and Ross Wayland from UVa for helping us along in the right direction.

XTF into Eclipse How-To


As we get more serious about XTF, I wrote up a How-To document for bringing XTF into Eclipse so that it can be deployed as a dynamic web application. Let me know if you find it useful. Definitely let me know if you find it in error. We haven’t put a version of XTF into OhioLINK’s source code repository, but that might follow shortly.

A Known Citation Discovery Tool in a Library2.0 World

When it comes to seeking a full-text copy of that known-item citation, are our users asking “what have you done for me lately?” OpenURL has taken us pretty far when one starts in an online environment (a link that sends the citation elements to our favorite link resolver), but it only works when the user starts online with an OpenURL-enabled database. (We also need to set aside for the moment the need for some sort of OpenURL base resolver URL discovery tool: how does an arbitrary service know which OpenURL base resolver I want to use?) What if a user has a citation on a printed paper or from some other non-online form? Could we make their lives easier, too? Here is one way. (Thanks go out to Celeste Feather and Thomas Dowling for helping me think through the possibilities and issues.)
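For readers unfamiliar with what “a link that sends the citation elements to a link resolver” looks like, here is a minimal Python sketch that builds an OpenURL 1.0 (Z39.88-2004) key/encoded-value link for a journal article. The resolver base URL and the citation values are hypothetical placeholders, and the function name is an assumption; this is an illustration of the link format, not the tool proposed in the post.

```python
from urllib.parse import urlencode


def build_openurl(base_url, citation):
    """Build an OpenURL 1.0 KEV link for a journal article.

    base_url: the user's link resolver (hypothetical in this example).
    citation: dict of journal-format keys without the "rft." prefix,
              e.g. atitle, jtitle, volume, spage, date, issn.
    """
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    for key, value in citation.items():
        if value:  # skip elements the user could not supply
            params["rft." + key] = value
    return base_url + "?" + urlencode(params)


# Hypothetical citation, as a user might type it in from a printed page.
url = build_openurl(
    "https://resolver.example.edu/openurl",
    {
        "jtitle": "Journal of Examples",
        "volume": "12",
        "spage": "345",
        "date": "2005",
        "atitle": "",  # unknown, so it is left out of the link
    },
)
print(url)
```

Skipping empty elements matters for the printed-citation case: a user copying from paper often has only a journal title, volume, and starting page, and the resolver can frequently work with that subset.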

DLTJ under a New Theme

Okay, let’s try this again. The first time around didn’t go so well, so I went back to the basics and started with a new theme: what is running on DLTJ now is a modestly modified version of Barthelme version 1.2.2 by Scott Allan Wallick. The modifications include the insertion of the Extended Live Archives plugin on the front page. I’m pretty excited about this. I think it better answers the question of why someone would want to come to the home page of a blog: not to read a reverse chronological list of the author’s thoughts, but to use a mechanism for drilling down to what they might be looking for (whether it be by category, by tag, or chronologically). Take a look at the home page and let me know what you think.

Integration announced for DPubS (e-journal publishing system) and FEDORA (digital object repository)

The August 2006 edition of “The DPubS Report,” produced by Cornell University Libraries for the DPubS community, announced work underway at Penn State to bridge the worlds of DPubS and FEDORA. Here is the line from the newsletter:

--------------------------------------------------------------------------
SOFTWARE DEVELOPMENT UPDATE
--------------------------------------------------------------------------
[...]
NEAR-TERM SCHEDULED WORK
[...]
* Penn State is working on Fedora interoperability. The plan is to
have that capability in the September release, with a working version
for testing in late August.

The newsletter goes on to say that the work will be made available under an open source license, so I for one can’t wait to see what it looks like and how we might apply it to our own needs.