What is known about GetFTR at the end of 2019


In early December 2019, a group of publishers announced Get-Full-Text-Research, or GetFTR for short. There was a heck of a response on social media, and the response was—on the whole—not positive from my librarian-dominated corner of Twitter. For my early take on GetFTR, see my December 3rd blog post “Publishers going-it-alone (for now?) with GetFTR.” As that post title suggests, I took the five founding GetFTR publishers to task on their take-it-or-leave-it approach. I think that is still a problem. To get you caught up, here is a list of other commentary.

If you are looking for a short list of what to look at, I recommend these posts.

GetFTR’s Community Update

On December 11—after the two posts I list below—an “Updating the Community” web page was posted to the GetFTR website. From a public relations perspective, it was…interesting.

We are committed to being open and transparent

This section goes on to say, “If the community feels we need to add librarians to our advisory group we will certainly do so and we will explore ways to ensure we engage with as many of our librarian stakeholders as possible.” If the GetFTR leadership didn’t get the indication between December 3 and December 12 that librarians feel strongly about being at the table, then I don’t know what will. And it isn’t about being on the advisory group; it is about being seen and appreciated as important stakeholders in the research discovery process. I’m not sure who the “community” is in this section, but it is clear that librarians are—at best—an afterthought. That is not the kind of “open and transparent” that is welcoming.

Later on in the Questions about library link resolvers section is this sentence:

We have, or are planning to, consult with existing library advisory boards that participating publishers have, as this enables us to gather views from a significant number of librarians from all over the globe, at a range of different institutions.

As I said in my previous post, I don’t know why GetFTR is not engaging in existing cross-community (publisher/technology-supplier/library) organizations to have this discussion. It feels intentional, which colors the perception of what the publishers are trying to accomplish. To be honest, I don’t think the publishers are using GetFTR to drive a wedge between library technology service providers (who are needed to make GetFTR a reality for libraries) and libraries themselves. But I can see how that interpretation could be made.

Understandably, we have been asked about privacy.

I punted on privacy in my previous post, so let’s talk about it here. It remains to be seen what is included in the GetFTR API request between the browser and the publisher site. Sure, it needs to include the DOI and a token that identifies the patron’s institution. We can inspect that API request to ensure nothing else is included. But the fact that the design of GetFTR has the browser making the call to the publisher site means that the publisher site knows the IP address of the patron’s browser, and the IP address can be considered personally identifiable information. This issue could be fixed by having the link resolver or the discovery layer software make the API request, and according to the Questions about library link resolvers section of the community update, this may be under consideration.

So, yes, an auditable privacy policy and an auditable implementation are key for GetFTR.
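To make "inspect that API request" concrete, here is a minimal sketch of the kind of check an auditor could run against a captured request URL. The GetFTR request format had not been published, so the endpoint and the parameter names (`doi`, `institution_token`) are assumptions for illustration only.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical parameter names: the real GetFTR request format is not public.
ALLOWED_PARAMS = {"doi", "institution_token"}


def unexpected_params(request_url: str) -> set[str]:
    """Return query parameters beyond the DOI and the institution token."""
    query = parse_qs(urlsplit(request_url).query, keep_blank_values=True)
    return set(query) - ALLOWED_PARAMS


# Made-up example requests against a placeholder publisher host.
clean = (
    "https://publisher.example/entitlements"
    "?doi=10.1234%2Fabcd&institution_token=abc123"
)
leaky = clean + "&user_email=patron%40example.edu"

print(unexpected_params(clean))
print(unexpected_params(leaky))
```

A check like this only covers what is in the request itself. It cannot see the browser's IP address, which the publisher receives regardless, and that is why moving the call to the link resolver or discovery layer matters.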

GetFTR is fully committed to supporting third-party aggregators

This is good to hear. I would love to see more information published about this, including how discipline-specific repositories and institutional repositories can have their holdings represented in GetFTR responses.

My Take-a-ways

In the second to last paragraph: “Researchers should have easy, seamless pathways to research, on whatever platform they are using, wherever they are.” That is a statement that I think every library could sign onto. This Updating the Community is a good start, but the project has dug a deep hole of trust and it hasn’t reached level ground yet.

Lisa Janicke Hinchliffe’s “Why are Librarians Concerned about GetFTR?”

Posted on December 10th in The Scholarly Kitchen, Lisa outlines a series of concerns from a librarian perspective. I agree with some of these; others are not an issue in my opinion.

Librarian Concern: The Connection to Seamless Access

Many librarians have expressed a concern about how patron information can leak to the publisher through ill-considered settings at an institution’s identity provider. Seamless Access can ease access control because it leverages a campus’ single sign-on solution—something that a library patron is likely to be familiar with. If the institution’s identity provider is overly permissive in the attributes about a patron that get transmitted to the publisher, then there is a serious risk of tying a user’s research activity to their identity and the bad things that come from that (patrons self-censoring their research paths, commoditization of patron activity, etc.). I’m serving on a Seamless Access task force that is addressing this issue, and I think there are technical, policy, and education solutions to this concern. In particular, I think some sort of intermediate display of the attributes being transmitted to the publisher is most appropriate.
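As a rough illustration of that intermediate display, the sketch below splits a patron's attributes into what would be released and what would be withheld, then renders the text a patron might see before continuing. The attribute names come from the eduPerson schema, but the release policy, the function names, and the sample patron are assumptions, not anything specified by Seamless Access.

```python
# Assumed minimal release policy: affiliation and entitlement only.
MINIMAL_RELEASE = {"eduPersonScopedAffiliation", "eduPersonEntitlement"}


def split_release(attributes: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Separate attributes to release from the names of those withheld."""
    released = {k: v for k, v in attributes.items() if k in MINIMAL_RELEASE}
    withheld = sorted(k for k in attributes if k not in MINIMAL_RELEASE)
    return released, withheld


def consent_screen(attributes: dict[str, str], publisher: str) -> str:
    """Build the text shown to the patron before anything is transmitted."""
    released, withheld = split_release(attributes)
    lines = [f"Sending to {publisher}:"]
    lines += [f"  {name} = {value}" for name, value in sorted(released.items())]
    lines.append("Withheld: " + (", ".join(withheld) or "nothing"))
    return "\n".join(lines)


# Hypothetical patron record held by the institution's identity provider.
patron = {
    "eduPersonScopedAffiliation": "member@example.edu",
    "eduPersonEntitlement": "urn:mace:dir:entitlement:common-lib-terms",
    "mail": "pat.patron@example.edu",
    "displayName": "Pat Patron",
}

print(consent_screen(patron, "publisher.example"))
```

The point of the design is that identifying attributes such as `mail` never leave the institution, and the patron can see that for themselves.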

Librarian Concern: The Limited User Base Enabled

As Lisa points out, the population of institutions that can take advantage of Seamless Access, a prerequisite for GetFTR, is very small and weighted heavily towards well-resourced institutions. To the extent that projects like Seamless Access (spurred on by a desire to have GetFTR-like functionality) help with the adoption of SAML-based infrastructure like Shibboleth, the whole academic community benefits from a shared authentication/identity layer that can be assumed to exist.

Librarian Concern: The Insertion of New Stumbling Blocks

Of the issues Lisa mentioned here, I’m not concerned about users being redirected to their campus single sign-on system in multiple browsers on multiple machines. This is something we should be training users about—there is a single website to put your username/password into for whatever you are accessing at the institution. That a user might already be logged into the institution single sign-on system in the course of doing other school work and never see a logon screen is an attractive benefit to this system.

That said, it would be useful for an API call from a library’s discovery layer to a publisher’s GetFTR endpoint to be able to say, “This is my user. Trust me when I say that they are from this institution.” If that were possible, then the Seamless Access Where-Are-You-From service could be bypassed for the GetFTR purpose of determining whether a user’s institution has access to an article on the publisher’s site. It would sure be nice if librarians were involved in the specification of the underlying protocols early on so these use cases could be offered.

Update

Lisa reached out on Twitter to say (in part): “Issue is GetFTR doesn’t redirect and SA doesnt when you are IPauthenticated. Hence user ends up w mishmash of experience.” I went back to read her Scholarly Kitchen post and realized I did not fully understand her point. If GetFTR is relying on a Seamless Access token to know which institution a user is coming from, then that token must get into the user’s browser. The details we have seen about GetFTR don’t address how that Seamless Access institution token is put in the user’s browser if the user has not been to the Seamless Access select-your-institution portal. One such case is when the user is coming from an IP-address-authenticated computer on a campus network. Do the GetFTR indicators appear even when the Seamless Access institution token is not stored in the browser? If at the publisher site the GetFTR response also uses the institution IP address table to determine entitlements, what does a user see when they have neither the Seamless Access institution token nor the institution IP address? And, to Lisa’s point, how does one explain this disparity to users? Is the situation better if the GetFTR determination is made in the link resolver rather than in the user browser?
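One way to lay out the cases in those questions is a small truth table. This is only a sketch of the possibilities, not a description of how GetFTR behaves: whether a publisher falls back to its IP address table is exactly one of the things we don't know, and the function name and return strings are made up for illustration.

```python
from itertools import product


def entitlement_basis(has_sa_token: bool, on_campus_ip: bool) -> str:
    """Name what a publisher could use to decide institutional entitlement."""
    if has_sa_token:
        return "institution token"
    if on_campus_ip:
        # Assumption: applies only if the publisher falls back to IP tables.
        return "IP address table"
    return "none: institution unknown"


# Walk all four combinations of token present / campus IP address.
for has_token, on_campus in product([True, False], repeat=2):
    basis = entitlement_basis(has_token, on_campus)
    print(f"token={has_token!s:5} campus_ip={on_campus!s:5} -> {basis}")
```

The last row is the off-campus user who has never visited the select-your-institution portal, and it is the one that is hardest to explain to a patron.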

Librarian Concern: Exclusion from Advisory Committee

See previous paragraph. That librarians are not at the table offering use cases and technical advice means that the developers are likely closing off options that meet library needs. Addressing those needs would ease the acceptance of the GetFTR project as mutually beneficial. So an emphatic “AGREE!” with Lisa on her points in this section. Publishers—what were you thinking?

Libraries and library technology companies are making significant investments in tools that ease the path from discovery to delivery. Would the library’s link resolver benefit from a real-time API call to a publisher’s service that determines the direct URL to a specific DOI? Oh, yes—that would be mighty beneficial. The library could put that link right at the top of a series of options that include a link to a version of the article in a Green Open Access repository, redirection to a content aggregator, one-click access to an interlibrary-loan form, or even an option where the library purchases a copy of the article on behalf of the patron. (More likely, the link resolver would take the patron right to the article URL supplied by GetFTR, but the library link resolver needs to be in the loop to be able to offer the other options.)
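Here is a minimal sketch of that ordering as a link resolver might build it. Everything in it is hypothetical: the `AccessOption` type, the function name, the labels, and the placeholder URLs are assumptions, and the GetFTR link is simply passed in as an optional value rather than fetched from a real API.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass
class AccessOption:
    label: str
    url: str


def build_options(
    doi: str,
    getftr_url: Optional[str],
    green_oa_url: Optional[str],
    aggregator_url: Optional[str],
    ill_form_base: str,
) -> list[AccessOption]:
    """Order the patron's choices, with the publisher link first if present."""
    candidates = [
        ("Full text at the publisher", getftr_url),
        ("Open access copy in a repository", green_oa_url),
        ("Full text from a content aggregator", aggregator_url),
    ]
    options = [AccessOption(label, url) for label, url in candidates if url]
    # Interlibrary loan is always offered as the fallback.
    ill_url = f"{ill_form_base}?doi={quote(doi, safe='')}"
    options.append(AccessOption("Request through interlibrary loan", ill_url))
    return options


options = build_options(
    doi="10.1234/abcd",
    getftr_url="https://publisher.example/article/abcd",
    green_oa_url="https://repository.example/handle/5678",
    aggregator_url=None,
    ill_form_base="https://library.example/ill",
)
for option in options:
    print(f"{option.label}: {option.url}")
```

Because the library's software assembles the list, the library decides what appears and in what order, which is the reason the link resolver needs to be in the loop.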

My Take-a-ways

The patron is affiliated with the institution, and the institution (through the library) is subscribing to services from the publisher. The institution’s library knows best what options are available to the patron (see above section). Want to know why librarians are concerned? Because the publishers are inserting themselves as the arbiter of access to content, whether it is in the patron’s best interest or not. It is also useful to reinforce Lisa’s closing paragraph:

Whether GetFTR will act to remediate these concerns remains to be seen. In some cases, I would expect that they will. In others, they may not. Publishers’ interests are not always aligned with library interests and they may accept a fraying relationship with the library community as the price to pay to pursue their strategic goals.

Ian Mulvany’s “thoughts on GetFTR”

Ian’s entire post from December 11th in ScholCommsProd is worth reading. I think it is an insightful look at the technology and its implications. Here are some specific comments:

Clarifying the relation between SeamlessAccess and GetFTR

There are a couple of things that I disagree with:

OK, so what is the difference, for the user, between seamlessaccess and GetFTR? I think that the difference is the following - with seamless access you the user have to log in to the publisher site. With GetFTR if you are providing pages that contain DOIs (like on a discovery service) to your researchers, you can give them links they can click on that have been setup to get those users direct access to the content. That means as a researcher, so long as the discovery service has you as an authenticated user, you don’t need to even think about logins, or publisher access credentials.

To the best of my understanding, this is incorrect. With SeamlessAccess, the user is not “logging into the publisher site.” If the publisher site doesn’t know who a user is, the user is bounced back to their institution’s single sign-on service to authenticate. If the publisher site doesn’t know where a user is from, it invokes the SeamlessAccess Where-Are-You-From service to learn which institution’s single sign-on service is appropriate for the user. If a user follows a GetFTR-supplied link to a publisher site but the user doesn’t have the necessary authentication token from the institution’s single sign-on service, then they will be bounced back for the username/password and redirected to the publisher’s site. GetFTR signaling that an institution is entitled to view an article does not mean the user can get it without proving that they are a member of the institution.
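The flow in that paragraph can be sketched as a tiny state machine. This reflects my reading of the flow described above rather than any published specification, and the function names and step labels are invented for the example.

```python
def next_step(institution_known: bool, signed_in: bool) -> str:
    """Decide what the publisher site does with an arriving user."""
    if not institution_known:
        return "where-are-you-from"  # ask which institution the user belongs to
    if not signed_in:
        return "institution-sso"  # bounce to the campus single sign-on
    return "article"


def walk(institution_known: bool, signed_in: bool) -> list[str]:
    """Follow the redirects until the article is served."""
    steps = []
    while True:
        step = next_step(institution_known, signed_in)
        steps.append(step)
        if step == "where-are-you-from":
            institution_known = True
        elif step == "institution-sso":
            signed_in = True
        else:
            return steps


# A user following a GetFTR link with no prior session still has to sign in.
print(walk(institution_known=False, signed_in=False))
print(walk(institution_known=True, signed_in=True))
```

Note that nothing in the sketch takes a GetFTR entitlement as input. An entitlement says the institution may view the article; it does not stand in for the user proving membership.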

What does this mean for Green Open Access

A key point that Ian raises is this:

One example of how this could suck, lets imagine that there is a very usable green OA version of an article, but the publisher wants to push me to using some “e-reader limited functionality version” that requires an account registration, or god forbid a browser exertion, or desktop app. If the publisher shows only this limited utility version, and not the green version, well that sucks.

Oh, yeah…that does suck, and it is because the library—not the publisher of record—is better positioned to know what is best for a particular user.

Will GetFTR be adopted?

Ian asks, “Will google scholar implement this, will other discovery services do so?” I do wonder if GetFTR is big enough to attract the attention of Google Scholar and Microsoft Research. My gut tells me “no”: I don’t think Google and Microsoft are going to add GetFTR buttons to their search results screens unless they are paid a lot. As for Google Scholar, it is more likely that Google would build something like GetFTR to get the analytics rather than rely on a publisher’s version.

I’m even more doubtful that the companies pushing GetFTR can convince discovery layer makers to embed GetFTR into their software. Since the two widely adopted discovery layers (in North America, at least) are also aggregators of journal content, I don’t see the discovery-layer/aggregator companies devaluing their product by actively pushing users off their site.

My Take-a-ways

It is also useful to reinforce Ian’s closing paragraph:

I have two other recommendations for the GetFTR team. Both relate to building trust. First up, don’t list orgs as being on an advisory board, when they are not. Secondly it would be great to learn about the team behind the creation of the Service. At the moment its all very anonymous.

Where Do We Stand?

Wow, I didn’t set out to write 2,500 words on this topic. At the start I was just taking some time to review everything that happened since this was announced at the start of December and see what sense I could make of it. It turned into a literature review of sorts.

While GetFTR has some powerful backers, it also has some pretty big blockers:

  • Can GetFTR help spur adoption of Seamless Access enough to convince big and small institutions to invest in identity provider infrastructure and single sign-on systems?
  • Will GetFTR grab the interest of Google, Google Scholar, and Microsoft Research (where admittedly a lot of article discovery is already happening)?
  • Will developers of discovery layers and link resolvers prioritize GetFTR implementation in their services?
  • Will libraries find enough value in GetFTR to enable it in their discovery layers and link resolvers?
  • Would libraries argue against GetFTR in learning management systems, faculty profile systems, and other campus systems if their own services cannot be included in GetFTR displays?

I don’t know, but I think it is up to the principals behind GetFTR to make more inclusive decisions. The next step is theirs.