Analysis of PubGet — An Expedited Fulltext Service for Life Science Journal Articles

In June, a new service that speeds access to life sciences literature reached a milestone: the activation of its 50th institution. Called PubGet, the service reduces the number of clicks needed to reach the full text of an article. Using its own proprietary “pathing engine”, it links directly to the full text on the publisher’s website. PubGet does this by understanding the link structure for each journal of each publisher and constructing the link to the full text from information in the citation. The PubGet service focuses on the life sciences journals indexed in PubMed, hence the play on names: PubMed to PubGet.

How It Works


[Figure: OLINKS link resolver screen for Christensen article]

[Figure: EBSCOhost screen for Christensen article]

[Figure: Typical view of a PubGet article display]

In a typical interaction, a user starts at a web page with a journal article citation that links to the user’s OpenURL resolver. That link contains the citation metadata that identifies the specific article. Clicking it leads to the OpenURL resolver web page for that article, which contains links to any online versions of the article and might also include links to library catalog records for physical copies and options to search for similar articles. An example of one of these pages is this one from my place of work for an article by Clayton Christensen in the Harvard Business Review. (When you are coming from an OhioLINK member institution, it looks like the OLINKS screen image above.) Clicking the link that says “Full text of this article at EBSCO” leads to yet another page, this time from EBSCOhost, that repeats the citation data and offers options for viewing or taking other actions on the article. From there it is one more click to the HTML or PDF full text of the article. From the perspective of the creators of PubGet, that is two clicks and two screens too many. PubGet’s pathing engine knows the structure of links on the publisher’s website, so it creates a link directly from the citation in the search results list to the article PDF.
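To make the contrast concrete, here is a minimal Python sketch of the two kinds of link that can be built from the same citation: an OpenURL pointing at a resolver, and a direct link of the sort a pathing engine might construct. The resolver address, the publisher URL template, and the ISSN are all made up for illustration; PubGet's actual rules are proprietary and not described in this post. The `rft.*` keys follow the key/encoded-value style of OpenURL 1.0.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; a real one is specific to each institution.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

# Hypothetical per-journal link templates keyed by ISSN. A real pathing
# engine would need one rule set per journal per publisher.
LINK_TEMPLATES = {
    "1234-5678": "https://publisher.example.com/content/{volume}/{issue}/{spage}.full.pdf",
}


def openurl_link(citation):
    """Build a resolver link that carries the citation metadata."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.issn": citation["issn"],
        "rft.volume": citation["volume"],
        "rft.issue": citation["issue"],
        "rft.spage": citation["spage"],
    }
    return RESOLVER_BASE + "?" + urlencode(params)


def direct_link(citation):
    """Fill in the journal's link template, or return None if none is known."""
    template = LINK_TEMPLATES.get(citation["issn"])
    if template is None:
        return None
    return template.format(**citation)


citation = {"issn": "1234-5678", "volume": "12", "issue": "3", "spage": "45"}
print(openurl_link(citation))  # leads to the resolver menu page
print(direct_link(citation))   # leads straight to the PDF
```

The first link still lands on an intermediate menu page; the second skips it, which is the whole appeal, at the cost of maintaining a template for every journal.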

The pathing engine is one of three components that make up the service. The other two are a search engine and a personalization feature. The search engine indexes the citation and abstract fields; it is not nearly as sophisticated as the thesaurus-driven search engine native to PubMed, but it does the job when you have a known citation. The personalization feature lets you tie your PubGet account to an institution; with that knowledge, the PubGet service knows exactly what digital rights your institution has for each journal and can create links to the full-text article that go through your institution’s proxy server. The account system also lets you have new articles matching your search criteria sent to you and mark articles in the search results for later bulk downloading (via a Firefox plugin). [1]
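The personalization step can be pictured as a small lookup, sketched below in Python. Everything here is an assumption for illustration: the institution record, the list of licensed ISSNs, and the `login?url=` prefix (a style used by some library proxy servers). How PubGet actually stores rights data is not described in this post.

```python
# Hypothetical institution settings, for illustration only.
INSTITUTIONS = {
    "example-university": {
        "proxy_prefix": "https://proxy.example.edu/login?url=",
        "licensed_issns": {"1234-5678"},
    },
}


def personalized_link(institution_id, issn, fulltext_url):
    """Route a full-text link through the institution's proxy when licensed."""
    institution = INSTITUTIONS.get(institution_id)
    if institution is None or issn not in institution["licensed_issns"]:
        # No known rights for this journal: return the bare publisher link.
        return fulltext_url
    return institution["proxy_prefix"] + fulltext_url


pdf = "https://publisher.example.com/content/12/3/45.full.pdf"
print(personalized_link("example-university", "1234-5678", pdf))
```

An off-campus user following the proxied link would be asked to log in by the institution's proxy before being passed along to the publisher.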

Thinking About PubGet in a Wider Information Ecosystem


One quandary I have with PubGet is that it bypasses OpenURL as the open standard for linking to full-text content. To take advantage of PubGet’s unique characteristic, the pathing engine that goes straight to the article text, you need to start at the PubGet site itself to get the direct URLs to the articles. This is a pretty significant downside to the service: you can’t combine the pathing engine with the powerful PubMed search engine.

It would be nice if PubGet could be set up as an OpenURL target that, on receiving a request, translates it into the direct link to the full text using its pathing engine. That way I could set up PubGet as the OpenURL resolver in my PubMed account, and the article links in PubMed would automatically go to the full text. I don’t know how this would work as a business model for PubGet, though, because acting as an OpenURL resolver in this manner makes the PubGet website invisible: through a series of browser redirects I’d go from PubMed to PubGet to the publisher site. (If the point is to make money selling advertising for related industries, a configuration that completely bypasses any visible sign of PubGet would cut into that revenue source.)
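As a rough sketch of what such an OpenURL target might do, the Python function below takes the query string of an incoming OpenURL request and answers with a browser redirect to the full text. The publisher template and ISSN are hypothetical, and this is not a description of anything PubGet offers; it only illustrates the request-to-redirect translation.

```python
from urllib.parse import parse_qs

# Hypothetical per-journal link templates keyed by ISSN.
LINK_TEMPLATES = {
    "1234-5678": "https://publisher.example.com/content/{volume}/{issue}/{spage}.full.pdf",
}


def resolve(query_string):
    """Return (status, headers) for an incoming OpenURL query string."""
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    template = LINK_TEMPLATES.get(params.get("rft.issn"))
    if template is None:
        return 404, {}  # no link rule known for this journal
    try:
        location = template.format(
            volume=params["rft.volume"],
            issue=params["rft.issue"],
            spage=params["rft.spage"],
        )
    except KeyError:
        return 400, {}  # the citation lacks a field the template needs
    return 302, {"Location": location}


print(resolve("rft.issn=1234-5678&rft.volume=12&rft.issue=3&rft.spage=45"))
```

Because the answer is a bare redirect, the reader never sees a page from the target itself, which is exactly the business-model wrinkle for an advertising-supported site.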

As I was sharing background on PubGet with Thomas Dowling, a colleague at OhioLINK, he pointed out something I didn’t know about OpenURL: it is within the standard to specify a “service type” in the OpenURL Context Object. Section 5.1 of the NISO standard for OpenURL says a service type is “The resource that defines the type of service (pertaining to the Referent) that is requested.” And there are indeed service types registered as part of the SAP2 Community Profile: abstract, citation, fulltext, holdings, ill, and any. So the recipient of an OpenURL request, using the “fulltext” Service Type, should be able to replicate the proprietary PubGet pathing engine using a standard OpenURL structure. In a brief bit of experimentation, though, I was not able to find an OpenURL resolver that a) knew how to handle a Service Type parameter, and/or b) knew how to honor that parameter by getting directly to the full text. As OpenURL undergoes its 5-year review this year, it might be worthwhile to emphasize this part of the standard with examples and descriptions of best practices so it is more widely adopted.
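For anyone who wants to repeat the experiment, here is a small Python sketch that builds an OpenURL asking for the “fulltext” service type. The resolver address and citation values are placeholders. The `svc_val_fmt` and `svc.fulltext` keys reflect one reading of the key/encoded-value format for scholarly service types and should be checked against the standard and the registry; whether a given resolver honors them is the open question.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; substitute your institution's own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def fulltext_openurl(issn, volume, issue, spage):
    """Build an OpenURL that requests the 'fulltext' service type."""
    params = [
        ("url_ver", "Z39.88-2004"),
        # Referent: the journal article being requested.
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.issn", issn),
        ("rft.volume", volume),
        ("rft.issue", issue),
        ("rft.spage", spage),
        # Service type: ask for full text rather than a menu of options.
        ("svc_val_fmt", "info:ofi/fmt:kev:mtx:sch_svc"),
        ("svc.fulltext", "yes"),
    ]
    return RESOLVER_BASE + "?" + urlencode(params)


print(fulltext_openurl("1234-5678", "12", "3", "45"))
```

A resolver that ignores the service type will simply show its usual menu page for this link, so the test is easy to run against any target.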

Footnotes

  1. Since the articles are not held within the PubGet service itself, the bulk article downloading function requires a Firefox plugin so that the article requests come from your browser to the publisher’s site.

3 Comments

  1. Jay Luker | August 5, 2009 at 11:28 am | Permalink

    The SFX link resolver is capable of both a & b. It recognizes the service type parameter and is capable of direct linking to the desired content. The SFX knowledge base is, in effect, a “pathing engine” that constructs target URLs by plugging the citation data into known URL structures.

    The fact that you have to initiate the search at PubGet seems like a definite downside. That said, there is still something disruptively appealing about it. Maybe just the more familiar search UI patterns.

    (disclaimer: I’m an SFX developer)

  2. the Jester | August 5, 2009 at 2:51 pm | Permalink

    Thanks, Jay — I didn’t know that about SFX or its knowledge base. I wonder how often it is implemented. I chatted with OhioLINK’s staff liaison to our user services committee; Zoe reaffirmed that our public services folks definitely want that intermediate screen. Perhaps for the same reasons that EBSCOhost wants that intermediate screen.

    If I get any spare time (hah!) I’m going to try constructing some OpenURLs against the leading journal publisher/aggregator targets and see if I get back something that resembles full text for the “fulltext” Service Type.

  3. Jens | December 20, 2009 at 8:36 am | Permalink

    Thanks, very interesting info!

From the Disruptive Library Technology Jester (http://dltj.org/), printed on Tuesday the 9th of February 2010 at 6:22:29 AM EST (-0500). The URL to this page is http://dltj.org/article/analysis-of-pubget-an-expedited-fulltext-service-for-life-science-journal-articles/

[Creative Commons Logo] This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.