WorldCat May Become Available as Library Linked Data under ODC-BY

On the second day of the OCLC Global Council meeting [agenda PDF] there was a presentation by Robin Murray (VP, OCLC Global Product Management) and Jim Michalko (VP, OCLC Research Library Partnership) called "Linked Open Data". The title of the presentation was an understatement because the real heart of the matter was WorldCat data as linked open data. The presentation was about an hour long, and despite the technical difficulties was fascinating to listen to through the streaming video feed. OCLC says the archive of the meeting will be available at some point, and I urge you to check it out when it becomes available.

Robin Murray's first half of the presentation walked through OCLC's thoughts about how their implementation of linked data is evolving alongside their programmatic API offerings. (Almost as an aside, Robin said that use of the WorldCat API has been growing steadily and is now up to 15 million hits per month -- or about 347 hits/minute. Wow!) He had a really slick diagram that showed the exposure of data through linked data services and API services across a half-dozen OCLC products. He called library linked data the "Data Exposure Service" of the WorldShare platform. There was one slide that I did copy down: with a title of "Opportunity or Threat", Robin proposed:

IF, collectively we can re-envision cataloging as 'registering nodes in a global web of data';

AND we can position WorldCat as the trusted, global web of library data;

THEN this dramatically increases the global value and utility of metadata management with WorldCat.

Robin went on to say that he sees publishing the OCLC cooperative's data assets as linked data as more opportunity than threat. Doing so grows the possible base of funding support for WorldCat; or, as he put it, it is a "significant opportunity to 'Grow the Denominator'" (where the fixed costs of adding value to member-contributed records are the numerator).

Jim Michalko's second half of the presentation walked through the history of the cooperative's Rights and Responsibilities document and past discussions about publishing WorldCat data, and led up to a recommendation from OCLC leadership to the Global Council and the Board of Trustees: publish WorldCat data as linked data with an Open Data Commons "BY" attribution license (a.k.a. ODC-BY). In fact, Jim explicitly said "no restrictions on commercial reuse" of individual records.

After the presentation and a short question/answer period, there were to be discussions at the tables, where one of the questions was whether the Global Council would recommend to the Board of Trustees the adoption of ODC-BY for WorldCat data. Each table was to report a summary of its deliberations, but unfortunately that part of the meeting wasn't webcast. We may have to wait for formal minutes of the meeting to be published to see what the conclusion of the discussion was.

This discussion at Global Council about WorldCat data follows a similar announcement from Thom Hickey about the Virtual International Authority File being published with an ODC-BY license with attribution to VIAF. Jonathan Rochkind posted his appreciation and approval of the VIAF move but predicted a couple of pain points with the ODC-BY licensing. What he says there is also true of an ODC-BY licensed WorldCat.

Both Robin and Jim made a point of modeling their practices on what is happening in the general linked data community, but I'm concerned that a Google search for "ODC-BY linked data" doesn't show many examples beyond OCLC's efforts. It is quite possible that my Google results are artificially skewed towards library results. On the Code4Lib IRC channel it was noted, for instance, that Linked Geo Data (derived from the Open Street Map project) is or soon will be ODC-BY. [See clarification in the comments.] If there are other examples, I'd appreciate hearing about them in the comments.

There was also a discussion about whether WorldCat data could be given by members to Europeana, which announced it is using the CC0 public domain dedication. There is an important mismatch between CC0 and ODC-BY -- notably the attribution requirement in the latter that isn't in the former. I can't faithfully recall direct quotes from the discussion -- you'll have to watch the video archive -- but the summary I remember was that doing so would be okay as long as the member was being reasonable about what was being shared with Europeana, and that the same would hold true for the Digital Public Library of America.

On the whole, the presentation and discussion were fascinating to follow, and from my point of view they represent a welcome and appropriate liberalization of how WorldCat data can be reused, as well as an intention on OCLC's part to publish bibliographic (and other) data so that it becomes a stable, generally usable anchor of library linked data. This is an important milestone toward putting libraries and library data at the core of the linked open data movement, and it will spur innovation and uses that we can only dream about now.

Update -- 18-Apr-2012

News from Twitter:

As I recall the process, it now moves to approval by the OCLC Board of Trustees.


