
Google News Archive Search — Where Are the Links to Content from Libraries?


Extra! Extra! Read All About It! “Explore History as it Happened: Google News Now Has Archive Search” Extra! Extra!

In my imagination I can see and hear the herald of the newspaper carrier on the street corner barking out this call. Except, Kids These Days would probably decry the use of dead trees to carry stale news and already be reading it on their PDAs and text-messaging each other on their cell phones. As it is, I found out about it through a story on Search Engine Watch (also found in the Wall Street Journal, the U.K. Guardian, and the New York Times) which itself touted Google’s “200 Year News Archive Search.” It is a nice service; I look at it, though, and have to wonder about the changing (if not outright diminishing) role of libraries as couriers of information. After all, couldn’t links to resources from the user’s local library be included right there next to the commercial article suppliers? If they could, why aren’t they? And what does it mean that they are not?

How Libraries Could Be Included


OpenURL linking would seem to play a key role here. It presumes that Google has the key metadata for each article. The ISSN of the periodical, for instance, would be great to include in the OpenURL ‘query string’. Since ISSNs only go back to the mid 1970s [1], though, the link resolver might have to do some fuzzy matching based on periodical title and so forth for older material, but it probably ought to be doing that anyway. More to the point, Google already has a program to adopt the use of OpenURL link resolvers as part of the Library Links Program in Google Scholar. Whether through IP address recognition on Google’s part or by the user saving a link resolver preference in Google Scholar, hyperlinks to the patron’s home library link resolver appear within the Google Scholar search results. It wouldn’t seem that hard to extend this capability to the new News Archive Search service.

(As an aside, wouldn’t the same thing be possible for the Google Books service as well? OpenURL can carry metadata about books as well as journals, so including a hyperlink to the library’s link resolver would seem to be a pretty easy thing to do.)
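To make the OpenURL idea a little more concrete, here is a rough sketch in Python of what building such a link for a news article might look like. It uses the key/encoded-value form of OpenURL 1.0 (Z39.88-2004) with the journal metadata format. The resolver address, the function name `build_openurl`, and the sample article are all made up for illustration; nothing here reflects how Google or any particular link resolver actually does it. Note that the ISSN is optional, since older periodicals will not have one.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; a real one would come from the patron's library.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(base, article_title, periodical_title, date,
                  issn=None, start_page=None):
    """Return an OpenURL 1.0 key/encoded-value link for a periodical article."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", article_title),
        ("rft.jtitle", periodical_title),
        ("rft.date", date),
    ]
    # Only include the ISSN and start page when they are known; without an
    # ISSN the resolver has to fall back on matching by periodical title.
    if issn:
        params.append(("rft.issn", issn))
    if start_page:
        params.append(("rft.spage", start_page))
    return base + "?" + urlencode(params)


# Made-up article from a made-up newspaper that predates ISSNs.
link = build_openurl(
    RESOLVER_BASE,
    article_title="Example Headline",
    periodical_title="Podunk Daily Bugle",
    date="1906-04-19",
)
print(link)
```

A search result page could place a link like this next to each commercial supplier link, substituting the base address of whichever resolver the user's library runs.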

Why Libraries Aren’t Included: The Possible Reasons


So having come this far in thinking about the lack of automated links to library holdings, I can’t help but wonder about the reasons why. These are the possible scenarios that I came up with:

1. In Progress (“We’re Working On It. What? ‘Internet Time’ Isn’t Fast Enough For You?”)


The scenario goes something like this: Google is a modestly large company, and a couple of its developers have used their 20% self-directed time to create this new News Archive Search service. In another part of the company, the Google Librarian team has only just heard about it and has already been over talking with the first group about adding the OpenURL links. They’re working on it, and it’ll be in a push of new software that comes out soon. In other words, the idea hadn’t occurred to the first group of developers; they thought it was a really keen idea once they heard it, but didn’t want to delay or stop the roll-out of the service while they worked on this new feature.

2. Benign Neglect (“Oh, You Mean Libraries Already Have Some of This Stuff?”)


Perhaps it hasn’t occurred to anyone, including the Google Librarian team, to put OpenURL links into the search results of the News Archive Search service. Sounds like a good idea, though, doesn’t it? If you think so, let them know!

A note from the News archive search team

Please let us know if you have suggestions, questions or comments about News archive search. Our goal with News archive search is to make it possible for users to read about historical events as they happened. History is often presented with a viewpoint of many years later and is frequently smoothed over in many ways and for many reasons. Enabling users to read about history as it unfolded allows them to explore and understand the past for themselves.

http://news.google.com/archivesearch/about.html

3. Clash of Business and Library Ethos (The “Follow the Money” Theory)


The first two scenarios are plausible enough, but the conspiracy theorist side of my personality wants to believe in this last scenario: there are dollar signs at the end of many of those search results, and Google is claiming a portion of that article purchase price for referring the user to one of these external suppliers. This is the “Follow The Money” theory, the one that says that, first and foremost, Google is a business with investors that expect rising profits, and that offering a link to a “free” supplier (or, “prepaid by the user’s library,” if you will) cuts into the revenue received from outside suppliers.

This is the prototypical clash between two very different value sets: that of a business seeking to maximize returns and profits, and that of a not-for-profit library seeking to economically acquire and present content to users at a cost — in aggregate — cheaper than the users could do for themselves.

I also can’t help but wonder about the ethos in play for some of the content suppliers that have partnered with Google to provide this service. From the Search Engine Watch article:

Google has partnered with news organizations including Time, The Wall Street Journal, The New York Times, the Guardian and the Washington Post, and aggregators including Factiva, LexisNexis, Thomson Gale and HighBeam Research, to index the full-text of content going back 200 years.

http://searchenginewatch.com/showPage.html?page=3623345

Whoa! Don’t many of us already do business with Factiva, LexisNexis, and Thomson Gale? (I don’t know about HighBeam Research; that name is new to me.) Wouldn’t they, too, have our IP address ranges and know about our patrons? In my limited amount of testing using an IP address associated with an institution that has access to products from these three vendors, I’m still asked to pay for an article that my library has, in effect, already acquired.

And What Does It Mean That We Are Not?


This is a very interesting and important question. I have some ideas on how to answer it, but it is late in the evening and before going further I’d like to see if there are comments on this line of thinking so far. Which of the three scenarios seems most likely to you? Or is there one I missed? Or is the whole premise of OpenURL linking out of the News Archive Search results impossible to begin with?

[Updated 20060907T1038 to include links to mainstream media articles about the new service.]

Footnotes

  1. Based on the date of adoption as listed in the ISSN entry on Wikipedia.
(This post was updated on 25-Oct-2006.)

6 Comments

  1. Thomas Dowling | September 7, 2006 at 7:31 am | Permalink

    One problem with linking to newspaper articles is that page-level details, and sometimes the date, may differ from edition to edition. A citation to the Podunk Daily Bugle early edition may be hard to track down in a database that archives the city final.

    Regardless, it’s well past time for Google either to make library content something other than an afterthought, or fess up to being a shill for pay-per-view content sites.

  2. the jester | September 7, 2006 at 6:47 pm | Permalink

    It looks like scenario #3, the Follow The Money one, may prove to be incorrect. Here is a paragraph from the end of this article from Business Week:

    What’s more, publishers don’t have to share the wealth with Google. The search-engine company will receive no payment from publishers’ content fees, advertising, or supplying traffic. Search results will be ranked by relevance, without any influence from publishers. The results initially will be served without Google’s customary sponsored links on the right side of the page, and at the outset, Google won’t make money directly from the service.

    So, assuming the difficulties that Thomas mentioned are not insurmountable, are we left with “in progress” or “benign neglect”?

  3. Tom Wilson | September 15, 2006 at 12:41 pm | Permalink

    Yes, I agree that it would be great to have links to library holdings through OpenURL, particularly since Google is doing this in other projects. But on the issue of access through other providers/publishers, is it clear that these other vendors have data going back 200 years already online?

  4. the jester | September 15, 2006 at 6:54 pm | Permalink

    But on the issue of access through other providers/publishers, is it clear that these other vendors have data going back 200 years already online?

    I hadn’t thought about it in quite that way, but I have to assume the answer is yes because Google has to have something to run through their OCR engine and indexes. Or was your question whether vendors that typically deal with the library community (e.g. Gale, Ebsco, etc.) have the content going back 200 years? We may need to forge alliances with other content providers, which admittedly does decrease the chance that something like OpenURLs in Google News Archive search results will actually happen.

  5. Tom Wilson | September 18, 2006 at 7:01 am | Permalink

    What I was focusing on is not so much do the vendors have the objects digitized, as are they offering it to subscribers currently. Vendors may have e-copies of things in e-form, perhaps for preservation purposes. That, in and of itself, does not mean that they make it available.

  6. the jester | September 18, 2006 at 10:03 am | Permalink

    Hmmm — interesting point. I’ve bought into the notion (hook, line and sinker) that the cost of electronic delivery is near zero, so if you do have content in digital form of some sort there is little incremental cost in making it available. Perhaps that is not a safe assumption?

1 Trackback

  1. The OPLIN 4cast » OPLIN 4cast #21 | September 12, 2006 at 3:00 pm | Permalink

    [...] Google News Archive Search - Where Are the Links to Content from Libraries? (Disruptive Library Technology Jester) [...]


From the Disruptive Library Technology Jester (http://dltj.org/), printed on Thursday the 13th of November 2008 at 2:07:28 PM EST (-0500). The URL to this page is http://dltj.org/article/google-nas-openurl/

[Creative Commons Logo] This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.