On Saturday morning of ALA Midwinter 2010, Dr. Jennifer Younger moderated a session on the progress of the OCLC Record Use Policy Council. The meeting started with an introduction to the reasons behind the creation of the Record Use Council, the charge of the Council from the board of trustees, and how the framing of the discussion of the policy is guided by the values and history of OCLC the cooperative. There wasn't much new here for those who have been following the progress of the policy discussion, so I am skipping over most of it with the exception of a few notable topics. After that, I'm focusing on the lengthy question and answer session that followed Dr. Younger's background presentation.
Highlights of the Background Presentation
Dr. Younger said that the review council is on track to get the proposed policy to the OCLC Board of Trustees in May, in time for it to be reviewed at the Board's June meeting. They haven't started putting pen to paper on a draft policy statement, but are close; next week the members of the Council will be in Dublin for a two-day meeting, and coming out of that will be a draft of the policy. From there, the draft policy will be reviewed by the various governance bodies of OCLC -- the regional council, the global council, and the board of trustees -- and there will be an extensive discussion about the draft policy at the global council meeting in April.
WorldCat itself is now made up of 170 million bibliographic records and 1.5 billion statements of holdings from libraries. A policy is needed to create a viable business plan for sustaining this resource.
What the policy will cover: rights and responsibilities of members that have created WorldCat -- the rights of members to use elements of WorldCat and the shared responsibilities to the members of the cooperative that go along with the rights; identifying acceptable use by third parties; what OCLC's rights are to use the records on behalf of the members; and a process for collective participation in reviewing and modifying the policy over time. It will also have a "rather robust" preamble that answers the questions of why a policy is needed, what problem the policy is trying to solve, and what it is about WorldCat that necessitates a policy.
An Aside: What's In a Name -- OCLC-the-membership and OCLC-the-stewards
The discussion of the record use policy is intertwined with the conversations about governance of the cooperative, and I think it is important to be aware that there are many facets to the OCLC name as it is commonly used. In some cases we use "OCLC" to mean the cooperative, or -- more specifically -- the members of the cooperative. To be more precise, I will usually refer to this group as "OCLC-the-membership." In other cases, it means the conglomeration of staff, hardware/software, and services centered at buildings in Dublin, OH. Previously I have called the latter "OCLC-the-corporate" but in the course of the record use policy council discussion, Jay Jordan took issue with this phrase and said he preferred "OCLC-the-steward." Names carry nuances, and I agree with Jay that OCLC-the-steward is a better name for the entity that is serving OCLC-the-membership.
The View from the Database Level
The representatives from the review council said their work focused on WorldCat as a database of records that OCLC-the-steward is managing on behalf of OCLC-the-membership. The council has gotten away from the discussion of individual records in favor of the value of the WorldCat database -- its data, services, and infrastructure -- as a whole. They recognize that the value and use of WorldCat lies not only in knowing about a book (its metadata) but also in knowing where it is located (the attached holdings). More specifically, the review council identified three kinds of value from WorldCat:
- As a supply of bibliographic records.
- The ability to represent library holdings -- the collective collection of libraries and the capability to reveal what libraries have in places like Google Book Search.
- Knowledge organization pieces: taking the shared contributions of libraries and making something more from them using authority control, terminologies, the Dewey Decimal classification system, FRBR work sets, etc.
It was interesting to note a non-U.S. perspective that the council has heard regarding the value of WorldCat. While most North American libraries strongly value WorldCat as a supply of bibliographic records (copy cataloging), the national libraries outside of North America are joining because adding their records to WorldCat gives greater visibility to their holdings. So for those libraries the second and third value propositions above carry more weight than the first, which is arguably the most valuable aspect for North American libraries.
The challenge the Council said it is facing is to put enough controls in place to protect the value and viability of WorldCat while allowing enough flexibility for members, non-members, and OCLC to experiment and derive new, valuable services. One of the questions the review council is grappling with is how the Cooperative can use "community norms" to ensure the responsibilities assigned to the members are followed, so that we govern ourselves.
In taking this database-wide view, the council has set aside issues of individual record ownership and copyright of data in records and focused on what is valuable about the collection of records as a whole to the membership. WorldCat as a whole collective is copyrighted. As explained in the follow-up discussion with members of the council, the intellectual property law surrounding WorldCat records extends across many jurisdictions, so the council chose to focus at the database level.
The review council heard of the need for clarification on how libraries must be able to extend rights to third party agents acting on behalf of a member library, and acknowledged the need to outline the responsibilities of members as they work with OCLC WorldCat records using non-OCLC-member third parties and agents of member libraries. There are efforts in the policy council to structure the resulting policy such that OCLC-the-steward would take the responsibility for policing third-party data activities (presuming, of course, the OCLC member notifies OCLC-the-steward that the activity is taking place). It was stated that there are companies that want to get WorldCat records with OCLC enhancements -- the control number, the fields upgraded by the internal WorldCat auditing software, etc. -- by paying for them once, or not paying for them at all, and resell them to other customers. These are viewed as attempts to profit off of what the cooperative has built without giving anything back to the cooperative. The policy is intended to help OCLC-the-steward prevent this from happening, not to limit what member libraries themselves can do.
In licensing WorldCat data to others, OCLC-the-steward is looking for remuneration of some sort for OCLC-the-cooperative. If the business use by a non-member third party is one that will harm the value and viability of the WorldCat Network, then the policy council wants to see it governed in some way. Remuneration can be in monetary form, where that external party pays a fee for the data. Or it can be in a non-monetary form, such as the quid pro quo with the internet search engines that have WorldCat data and in return drive traffic back to local libraries through linkage on WorldCat.org. As Jay Jordan put it succinctly, "I'll do a contract with anyone that returns value to the cooperative. Oftentimes, that is not cash." It was stated that there have been attempts to download the entire WorldCat database. In order to be able to legally stop that, there must be a policy in place that prohibits it.
WorldCat as Linked Data
I asked a question about whether anyone was advocating for the benefits to the world in general, and the specific example was putting WorldCat data into the semantic web. Significant portions of WorldCat data are freely available in a human-readable form, but not in a way that makes it easy for a machine to process and make relationships to other data -- a form of data representation commonly called "linked data." For example, Google as an entity can come and negotiate for the rights and responsibilities to use WorldCat data as part of its services. There isn't a corresponding entity in the semantic web world to come along and negotiate for the dissemination of basic facts about items in WorldCat to the linked data universe. The council has talked about the distinction between "public good" and "club (member) good." Some of this distinction is intended to be explained in the preamble. Linked data is a form of innovation that the council doesn't want to shut down. They are trying to find a way for the policy to encourage it rather than shut it down.
In reflecting on these notes and what else happened in the course of the meeting, I came up with other questions that might be valuable for the Record Use Policy Council to think about.
- In Jennifer's introduction, she talked about not only the value of the bibliographic records but also the value of the holdings. Has the Council looked at differing policies for bibliographic information versus holdings?
- The discussion of linked data was incomplete due to time constraints. Has there been a discussion about a differentiation of value for different types or views of data? Machine access versus human-oriented access? Linked data of some portion of the bibliographic record? Is the representation of the benefit of the world in general being taken into account in drafting policy guidelines?