Interesting Google Book Search Settlement Bits in Advance of Thursday’s Fairness Hearing


This article was imported from this blog's previous content management system (WordPress), and may have errors in formatting and functionality. If you find these errors are a significant barrier to understanding the article, please let me know.

Thursday will be a big day in the Google Book Search lawsuit settlement: the parties to the lawsuit, along with the objectors, supporters, and friends-of-the-court, will be in the courtroom of United States District Judge Denny Chin offering oral arguments in the final settlement/fairness hearing. In his order, Judge Chin recognized 26 parties that will each speak for up to five minutes on their positions on the settlement (21 in opposition, 5 in favor). The U.S. Department of Justice will also speak at the hearing. But I think we're all eagerly waiting to hear what the judge himself will say about the settlement agreement.

In the lead-up to the hearing, Associate Professor James Grimmelmann of New York Law School has continued his efforts, along with students from the school's Institute for Information Law and Policy, to make the documents and proceedings of the lawsuit accessible and understandable to non-lawyers. The most recent court filings leading up to Thursday's hearing contain some interesting nuggets.

In his posting on the motion for attorneys' fees, he notes that "counsel for the author sub-class are asking for the full $30 million in fees and reimbursement of their out-of-pocket costs." The filing contains information about the number of hours and the billing rate for some of the lawyers working on the case. Some of the stuff is just really interesting, like one filing that included everything from 18 hours by a partner of a firm (who is also a law professor at NYU) at a rate of $995/hour to an itemization of 51¢ for long distance calls by the firm related to the case. Whew!

More interesting to DLTJ readers would be Grimmelmann's highlights of Dan Clancy's declaration in support of the agreement. Dan Clancy is engineering director of the Google Book Search project, so he has unique insight into its inner workings. Grimmelmann notes that Clancy states:

  • To date, Google has Digitized over twelve million books, and intends to continue Digitizing books in the future.
  • Google has received metadata from 48 libraries.
  • Google pays approximately $2.5 million per year to license metadata from 21 commercial databases of information about books.
  • Google has gathered 3.27 billion records about Books, and analyzed them to identify more than 174 million unique works.

The third bullet is interesting in that I think we can eliminate one of the "commercial databases" from the list. I can't find it in my notes from ALA Midwinter, but I seem to recall hearing Jay Jordan (OCLC President) say something along the lines that OCLC was not receiving a monetary return from the sharing of bibliographic data with Google; the value OCLC gets for its membership comes from the links back to WorldCat from Google services. If I got this wrong, I hope someone from OCLC will call me out on it.

The last bullet is interesting, too: Google has identified more than 174 million unique works by analyzing all of the sources of data coming into it. I tried to find some numbers in the descriptions of WorldCat to compare that to, but didn't have any luck this evening. (Isn't there anything about statistics available on http://worldcat.org/?)

To Grimmelmann's highlights I would add this statement, which seems strangely out of place:

  • Google has no interest in censorship. Indeed, Google's mission is to organize the world's information and make it universally accessible and useful.

Has anyone brought censorship into the discussion yet? Privacy for sure, but censorship?

Also:

  • Google has developed algorithms to compare these numerous sources of metadata and identify the most accurate data about each book.

They certainly seem to have invested a lot of effort in this area. More info can be found in my summary of Kurt Groetsch's presentation at ALA Midwinter 2010.

The text was modified to update a link from http://www.computerhistory.org/events/index.php?spkid=0&ssid=1246406058 to http://www.computerhistory.org/collections/catalog/102702332 on September 26th, 2013.