Beyond Federated Search Redux

This article was imported from this blog's previous content management system (WordPress), and may have errors in formatting and functionality. If you find these errors are a significant barrier to understanding the article, please let me know.

It started with a post by Carl Grant on the Federated Search Blog, "Beyond Federated Search – Winning the Battle and Losing the War?" I bookmarked it in Delicious and copied this extended quote from the text into the bookmark:

I’ve long argued that librarianship on top of digital information is about the authority/authenticity/appropriateness of the information provided to the user, as opposed to the overwhelming amounts of information available via other search tools that don’t provide that differentiation. In order to meet those tests, one thing that is clear is that libraries and librarians should never cede control to other organizations over the content they offer to their end-users. It doesn’t matter if that happens because the content providers fail to provide access via federated search, or whether the library has allowed third party organizations to determine what content they can access via a local index discovery tool. Ceding this control cripples the ability of a library to build unique and precise informational offerings that target the needs of their end-users.

This in turn got pulled into my FriendFeed stream, and the ensuing discussion seemed too valuable to let sit there, so I'm creating this post with those replies and adding a few more of my own thoughts. (Since all of these were public comments, I believe it is good netiquette to reproduce them here with attribution. If not, please let me know... particularly if you are one of the people quoted!)

Dorothea Salo was the first to post a comment:

1) We HAVE ceded control. So what do we do about that? 2) Authority/authenticity doesn't mean jack to the satisficing patron. Which is IMO most of them.

This was followed shortly by Deepak Singh:

That control is long gone. I think people do care about authority, but IMO, that will come from outside the library community, at least on the technology side.

Does everyone really think we have ceded control? I think we still have it; we just don't market it as an asset to the user like we should/could. It is "the discovery layer problem" that we are all trying to tackle. My take on it is that we should put all of the information we can into a unified index with a user interface as simple as Google but with the added advantage of improving relevance of results via fielded data and librarian vetting for authority/authenticity/appropriateness. I subscribe to the notion that federated search can't take us far enough ... that there is benefit in bringing together metadata for our vetted resources and expanding/enhancing the metadata. This added-value metadata comes in computing relationships and relevancy between records, attempting to apply uniform headings on records based on machine heuristics, and other tricks that can't be done in real time with small subsets of data that we get back through federated search interfaces.
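To make the unified-index idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the weights, the vetting boost, and the sample records are all hypothetical, and it does not represent any real discovery-layer product. It only shows that once records from several sources sit in one index, relevance can be computed across all of them using fielded data and a librarian-vetting flag, which is hard to do on the small result sets a federated search returns in real time.

```python
from dataclasses import dataclass

# Hypothetical field weights: a match in the title counts more than one in the description.
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "description": 1.0}
# Hypothetical multiplier applied to records a librarian has vetted.
VETTED_BOOST = 1.5


@dataclass
class Record:
    source: str
    title: str
    subject: str = ""
    description: str = ""
    vetted: bool = False


def build_index(*sources):
    """Merge the records from every source into one list (the 'unified index')."""
    return [record for source in sources for record in source]


def score(record, query):
    """Sum weighted term counts per field, then boost vetted records."""
    terms = query.lower().split()
    total = 0.0
    for field_name, weight in FIELD_WEIGHTS.items():
        words = getattr(record, field_name).lower().split()
        total += weight * sum(words.count(term) for term in terms)
    return total * VETTED_BOOST if record.vetted else total


def search(index, query):
    """Return matching records from the whole index, best score first."""
    scored = [(score(record, query), record) for record in index]
    hits = [(s, record) for s, record in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in hits]


# Made-up sample data standing in for two separate sources.
catalog = [
    Record("catalog", "Federated search in libraries", "search", "An overview", vetted=True),
]
articles = [
    Record("articles", "Library budgets", "finance", "Mentions federated search once"),
    Record("articles", "Search engines", "web", "General"),
]

index = build_index(catalog, articles)
results = search(index, "federated search")
for record in results:
    print(record.source, "|", record.title)
```

The point of the design is that scoring happens over the merged index rather than per source, so a vetted catalog record and an unvetted article are ranked against each other on the same scale.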

Richard Ackerman then jumped into the conversation with an excellent point about misplacing focus on the user interface itself:

I think to some extent it doesn't matter if we've ceded control or not - we've been having this discussion/argument at my office - the tech architects' side being that, if we want to add value at all, we have to build a discovery layer anyway - which we will expose in many different places, including browser extensions - but once you build it, the cost to also show it as a searchbox on your website is low. In other words, it doesn't matter that "they won't come" - the website is free anyway since you need to build the underlying infrastructure if you ever want to have a hope of delivering enhanced services around content and metadata. I also think "search in this box, and discover far more of the millions of dollars of content that we license for you than if you search in THAT box (e.g. google)" has got to be a compelling argument... surely? Researchers, what do you think?

In a FriendFeed comment, I thanked Richard for reminding me about how this concept is more than the user interface. I "know" that -- it is the cornerstone of OhioLINK's discovery layer strategy -- but I haven't internalized it in my thinking very well.

And a follow-up from Dorothea:

Yes, we have ceded control. We cannot insist that any given vendor support either an API or a data-provision protocol. Until we CAN, we have ceded control of discovery. Yet one more reason to dump the vendors in favor of OA.

I think Dorothea's argument is stronger for supporting open source software than open access content. With an open source software solution, we can see the innards of the data and create the APIs we need to make extended use of that data. Richard also followed up on Dorothea's comment:

Dorothea, I keep hearing that story internally too - "oh won't it will be great when all the publishers are gone and the library can be the Temple of OA". The whole point of OA is that anyone can have it, anywhere. Considering we do a terrible job of helping our users find content that is licensed from a few huge publishers, we're going to do better when content is scattered all over the place? Since it's OA, what's to stop a thousand startups from loading it all on their local harddrives? Doesn't OA just take the problem from libraries (who pay millions of dollars to license content) doing a terrible job even though they have a perfect right to intermediate access, to libraries (who pay nothing for OA content) trying to out do *every other web search engine on the entire web, a battle which we were never in and lost long ago*? OA doesn't make things better, it make them much, much worse for libraries. (and as always, I mean pure special/research libraries, not public libraries or unis)

William Gunn posted a comment:

It's a compelling argument, but I haven't seen an implementation that lives up to the promise. Pubmed's "searching pubmed for X will give Y results" message that it shows people arriving there via google search is the closest thing I've seen that actually shows more value in searching using their interface than using Google. Most in-library search functions I've seen (admittedly not many) are woefully bad, but they are probably the third-party things Dorothea is railing about.

I didn't post this comment to FriendFeed, but I agree with William's assessment. In fact, most of the innovation in end-user interfaces is coming out of the libraries themselves, public and academic, and not from the traditional vendor community. I'm thinking of projects like VuFind, HathiTrust, the OLE Project, and others. There are some notable exceptions to this -- the demonstration of Serials Solutions Summon at ALA Midwinter, for instance. But on the whole I think libraries are putting sweat equity into evolving or recreating their digital presence.

Dorothea followed up on Richard's comment:

I guess it depends on where the money turns out to be. I'm actually not all that troubled about libraries getting pushed out of the discovery business; if it can be done better and cheaper elsewhere, fine and dandy. If games start being played, libraries can get back in and compete as long as everything's still OA. We libraries suck enough at this that I think it's something we should stop doing.

I replied to Dorothea that I think we need to stick with the discovery end of the trade until context-sensitive linking -- e.g., getting the user to the appropriate copy -- is better. What I don't want is to give up on the discovery layer to the point where users aren't coming to content that we have paid for on their behalf. Perhaps that will be the day when everything is open access, but I can't hold my breath that long.

The last word at the moment goes to Dorothea:

Understood and agreed, Peter. Though I have days where I wish we'd just tell 'em "if I can't get my patrons easily and quickly to your stuff, it is WORTHLESS, ergo I will no longer pay for it." That doesn't necessarily have to mean OA, of course.

That is the conversation so far. Do you have any thoughts? Please add them here or on the original FriendFeed post. I should note that the WordPress plug-in I was using to shuttle comments between DLTJ posts and FriendFeed isn't working at the moment, so I may need to edit this post with interesting comments that come from FriendFeed (and vice versa).

The text was modified to update a link from http://friendfeed/mndoci to on January 28th, 2011.