Thursday Threads: Amazon Pressures Publishers, Academic Spam, Mechanical Turk Spam, Multispectral Imaging

With the close of the year approaching, this issue marks the 14th week of DLTJ Thursday Threads. This issue has a publisher’s view of Amazon’s strong-arm tactics in book pricing, research into the possibility that academic authors could game Google Scholar with spam, demonstrations of how Amazon’s Mechanical Turk drives down the cost of enlisting humans to overwhelm anti-spam systems, and a story of multispectral imaging adding information in the process of digital preservation.

As the new year approaches, I wish you the best professionally and personally.

Books After Amazon

What happens when an industry concerned with the production of culture is beholden to a company with the sole goal of underselling competitors? Amazon is indisputably the king of books, but the issue remains, as Charlie Winton, CEO of the independent publisher Counterpoint Press puts it, “what kind of king they’re going to be.” A vital publishing industry must be able take chances with new authors and with books that don’t have obvious mass-market appeal. When mega-retailers have all the power in the industry, consumers benefit from low prices, but the effect on the future of literature—on what books can be published successfully—is far more in doubt.

Onnesha Roychoudhuri publishes this view of Amazon’s marketing practices in the latest issue of the Boston Review. From the publisher’s perspective, the strong-arm tactics described sound horrible. But the story also points to cracks appearing, at least for the bigger publishers. That may leave smaller, independent publishers in a big squeeze. [Via OCLC Research’s Above-the-Fold]

Academic Search Engine Spam and Google Scholar’s Resilience Against it

Abstract: In a previous paper we provided guidelines for scholars on optimizing research articles for academic search engines such as Google Scholar. Feedback in the academic community to these guidelines was diverse. Some were concerned researchers could use our guidelines to manipulate rankings of scientific articles and promote what we call ‘academic search engine spam’. To find out whether these concerns are justified, we conducted several tests on Google Scholar. The results show that academic search engine spam is indeed—and with little effort—possible: We increased rankings of academic articles on Google Scholar by manipulating their citation counts; Google Scholar indexed invisible text we added to some articles, making papers appear for keyword searches the articles were not relevant for; Google Scholar indexed some nonsensical articles we randomly created with the paper generator SciGen; and Google Scholar linked to manipulated versions of research papers that contained a Viagra advertisement. At the end of this paper, we discuss whether academic search engine spam could become a serious threat to Web-based academic search engines.

Joeran Beel and Bela Gipp have this article in the most recent issue of the Journal of Electronic Publishing. In addition to showing that Google Scholar can be gamed, the authors note that Microsoft Academic Search and CiteSeer (as well as their own academic search engine currently under development, SciPlore) have the same issues. Although it is possible, we don’t know if it is being done, or even if there would be any penalties in the academic community for doing so.

Mechanical Turk: Now with 40.92% spam

At this point, Amazon Mechanical Turk has reached the mainstream. Pretty much everyone knows about the concept. Post small tasks online, pay people cents, and get thousands of micro-tasks completed. Unfortunately, this resulted in some unfortunate trends. Anyone who frequents just a little bit the market will notice the tremendous number of spammy HITs. (HIT = a task posted for completion in the market; stands for Human Intelligence Task). “Test if the ads in my website work”. “Create a Twitter account and follow me”. “Like my YouTube video”. “Download this app”. “Write a positive review on Yelp”. A seemingly endless amount of spam HITs come to the market, mainly with the purpose of spamming “social media” metrics. So, with Dahn Tamir and Priya Kanth (MS student at NYU), we decided to examine how big is the problem. How many spammers join the market? How many spam HITs are there?

This post from Panos Ipeirotis, Associate Professor at the IOMS Department at Stern School of Business of New York University, describes a review of activities posted to Amazon’s Mechanical Turk service. Spam is everywhere, and it appears that the Mechanical Turk is reducing the friction between buyers and workers of spam activity. [Via Ron Murray]

Cutting-Edge Imaging Helps Scholar Reveal 8th-Century Manuscript

With a manuscript like the St. Chad Gospels, multispectral imaging—a series of scans, each based on a single part of the color spectrum—allows his team to create images that have the equivalent of three-dimensional detail, down to revealing the thickness of brush strokes on letters and illustrations. Cockled pages can be virtually flattened out so that all their details can be studied. Studied color band by color band, the chemical composition of ink can be determined.

This article by Jennifer Howard at the Chronicle of Higher Education reviews the story of how 8th-century documents in England were digitized by scholars at the University of Kentucky. It caught my eye because of the mention of multispectral imaging; this is something that the JPEG2000 file format can natively store. Digitization at this level doesn’t just provide alternative, online access to documents; it actually adds new information to the process of researching those documents. [Note: the link is behind a publisher paywall. If you would like to see it, send me an e-mail and I’ll forward you a short-term link from the Chronicle’s website.]

Thursday Threads: Open Publishing Alternatives, Open Bibliographic Data, Earn an MBA in Facebook, Unconference Planning

The highlights of the past week are around publishing — first with a model proposed by Eric Hellman in which consumers can pool enough money to pay publishers to “set a book free” under a Creative Commons license, then with an announcement by the University of Pittsburgh offering free hosting of open access e-journals. Since we have to be able to describe and find this content, their bibliographic descriptions are important; John Wilkin proposes a model for open access to elements of bibliographic descriptions. Rounding out this week’s topics are a report of a master’s degree program in business using Facebook, and tips for planning an unconference meeting.

Paying Publishers to Set their Content Free

[Eric] Hellman’s new model is something he calls GlueJar. He proposes to “unglue” e-books from their publishers so that they can be available to the world, DRM-free and under Creative Commons license. Here’s the model: publishers sign on with works that they want to “unglue.” They determine what they are willing to be paid for ungluing each work. Users contribute money towards the ungluing. When the threshold amount is reached for a given title, that title is unglued: it appears in all contributors’ e-book reader libraries and in repositories used for online public library access. The publisher is paid, and GlueJar takes a commission.

In other words, publishers just need to determine a price for content being taken off their hands, and if the public is willing to pay that price, it happens. (Users aren’t charged until works they want to unglue are unglued.) No more transaction costs; anyone can distribute the content to anyone else. Publishers could possibly retain subsidiary rights to the content, such as print on demand or derivative work rights.
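The threshold mechanism described in the quote can be sketched in a few lines of Python. This is only an illustration of the idea, not GlueJar’s actual design: the class name, the field names, and the 10 percent commission are all assumptions (the post gives no commission figure), and the sketch assumes the commission is taken out of the money collected rather than added on top. Amounts are whole cents to avoid rounding surprises.

```python
from dataclasses import dataclass, field


@dataclass
class UnglueCampaign:
    """One title a publisher has offered to 'unglue' (hypothetical model)."""
    title: str
    threshold_cents: int          # price the publisher set for ungluing
    commission_percent: int = 10  # assumed figure; not given in the post
    pledges: dict = field(default_factory=dict)
    unglued: bool = False

    def total_pledged(self) -> int:
        return sum(self.pledges.values())

    def pledge(self, user: str, amount_cents: int) -> None:
        """Record a pledge; nobody is charged until the threshold is met."""
        if self.unglued:
            raise ValueError("title is already unglued")
        if amount_cents <= 0:
            raise ValueError("pledge must be positive")
        self.pledges[user] = self.pledges.get(user, 0) + amount_cents
        if self.total_pledged() >= self.threshold_cents:
            self.unglued = True

    def settle(self) -> dict:
        """Split the collected money between the publisher and the service."""
        if not self.unglued:
            # Threshold not reached: no one is charged, no one is paid.
            return {"charged": 0, "publisher": 0, "commission": 0}
        charged = self.total_pledged()
        commission = charged * self.commission_percent // 100
        return {"charged": charged,
                "publisher": charged - commission,
                "commission": commission}
```

With a 100,000-cent threshold, a 60,000-cent pledge leaves the title glued and nobody charged; a further 40,000-cent pledge unglues it and triggers the payout.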

Bill Rosenblatt of the Copyright and Technology blog looks at the problem publishers have of finding good content creators and having a model that makes that content widely available. Towards the end of his post, he summarizes Eric Hellman’s proposed model for “ungluing ebooks” in a way that makes sense for creators, publishers, and consumers. So far as I know, no one has taken Eric up on a trial of his model, but I think it would be interesting to see if it was practical. [Found via OCLC Research’s Above the Fold.]

University of Pittsburgh Library System Offers Free E-Journal Publishing Service

Pitt’s University Library System (ULS) is now offering free e-journal publishing services to help academic journals make their content available to a global audience while eliminating the cost of print production. 
The E-journal Publishing Program—part of ULS’ D-Scribe Digital Publishing Program, which partners with the University of Pittsburgh Press—“is in keeping with the ULS’ commitment to free and immediate access to scholarly information and its mission to support researchers in the production and sharing of knowledge in a rapidly changing publishing industry,” said Rush G. Miller, Hillman University Librarian and director of the ULS. 
The ULS trains a journal’s editorial staff in the use of Open Journal Systems (OJS) software, which channels the flow of scholarly content from initial author submissions through peer review and final online publication and indexing. OJS provides the tools necessary for the layout, design, copy editing, proofreading, and archiving of journal articles. The platform provides a vast set of reading tools to extend the use of scholarly content through RSS feeds and postings to Facebook and Twitter. E-journal articles can be discovered via blogs, databases, search engines, library collections, and other means. 

The University of Pittsburgh announced that it is offering the infrastructure for managing and hosting electronic journals with an at-cost print-on-demand supplement. Since the cost of the digital publishing platform is absorbed by the University of Pittsburgh and since peer review is typically done at no cost, what’s left on the expense side of the balance sheet? Paying the editorial staff? Marketing and advertising the journal? Has the University of Pittsburgh tipped the equation enough to make this model viable?

Open Bibliographic Data: How Should the Ecosystem Work?

In the conversations about openness of bibliographic data, I often find myself in an odd position, vehemently in support of it but almost as vehemently alarmed at the sort of rhetoric that circulates about the ways that data should be shared.

The problem with both the arguments OCLC makes and many of the arguments for openness seem to be predicated on the view that bibliographic data are largely inert, lifeless “records” and that these records are the units that should be distributed and consumed.

Nothing could be further from the truth.

The above quote is just one small piece of a posting by John Wilkin on the Open Knowledge Foundation blog. In it he plants a flag for the library profession to drive towards with bibliographic data that is published in a fine-grained, easily recombined manner. In being too focused on silos of “lifeless records” (WorldCat, local ILSs, Open Library, etc.), he suggests that the profession is missing out on ways we (and our users!) can combine and enhance bibliographic data. John’s statement is in parallel with a growing movement towards linked data, a movement that encompasses a reinvigorating of bibliographic description using FRBR and RDA (the current and progressive best thinking of the library community) with the foundational elements of the “semantic web” vision. For more on the latter, see the work of the W3C-supported Library Linked Data Incubator Group and the work of Karen Coyle and Diane Hillmann, among others.

On a related note, the JISC community in the UK has also published the Open Bibliographic Data Guide. “It is about the business cases for Open Bibliographic Data – releasing some or all of a library’s catalogue records for open use and re-use by others.”

Poking, Tagging and Now Landing an M.B.A

But thanks to a pair of young British entrepreneurs, students who do want both a business education and the credential to prove it can now pursue their studies at the same time as they “poke” their friends, tag photos, update their relationship status or harvest their virtual crops on FarmVille.

The London School of Business and Finance Global M.B.A. bills itself as “the world’s first internationally recognized M.B.A. to be delivered through a Facebook application.”

Hmm — meet the students where they are? This story from the New York Times outlines an MBA program that is fully immersed in the Facebook environment. I wonder if the completion rate of a Facebook-based program will be higher than that of other online systems because users spend more time in the Facebook environment. [Via Steven Bell]

How I Planned a Successful Unconference in 6 hours – and You Can Too

Last Friday I ran WhereCamp5280 in Denver, which attracted over 70 people (many from out of state and a couple from Canada), used thousands of dollars from top-tier sponsors and was organized in probably less than six hours total. An unconference is a conference in the loosest of terms. People show up, we build our own agenda and then go for it. Here I’ll describe how it was run.

Steve Coast, a guest author for ReadWriteWeb, gives this how-to guide for planning an unconference. An unconference is a relatively new style of event where the content of the meeting is defined by the people who show up and participate. The common guidelines for such meetings [1] are: 1) The people who come are the best people who could have come; 2) Whatever happens is the only thing that could have happened; 3) It starts when it starts; 4) It’s over when it’s over; and 5) Exercise the Law of Two Feet. The last might take some more explanation; it means: “If you are not learning or contributing to a talk or presentation or discussion it is your responsibility to find somewhere where you can contribute or learn.”

In my experience, the unconference format is great if you want a group to brainstorm around a central idea or if you want to promote professional networking connections among a group. If you are looking for a particular outcome or have a specific agenda, this format does not work well.


[1] These rules are common, but I found them most clearly expressed at the Scratchpad Wikia.

Thursday Threads: Disruption in Library Acquisitions, Publishing, and Remedial Education plus Checking Assumptions of Cloud Computing and a National Digital Library

If it is Thursday it must mean it is time for another in this series of Thursday Threads posts. This week there is an abundance of things that could fall into the category of “disruptive innovation” in libraries and higher education. If you find these interesting, you might want to subscribe to my FriendFeed stream where these topics and more are posted and discussed throughout the week.

Federal Textbook Disclosure Rules Now Law

The fact that the Higher Education Opportunity Act (Public Law 110-315), otherwise known as HEOA, was signed into law last year is probably not big news to anyone. One of the parts of the bill that I have been following and commenting on here in DLTJ is the textbook disclosure rules. I haven’t posted follow-up commentary here because I’ve been expecting that the U.S. Department of Education would be forthcoming with new regulations regarding the implementation of the disclosure rules. As it turns out, a sentence was added into the legislation between the time I last read it closely and when it finally was made law: “No Regulatory Authority- The Secretary shall not promulgate regulations with respect to this section.” It would appear the language of the law stands on its own.

In December, Vincent Sampson in the Department of Education’s Office of Postsecondary Education wrote a 219-page “Dear Colleague” letter that provides summaries of provisions of HEOA. One summary covers the “Textbook Information” section (from pages 34 and 35, in its entirety):

The HEOA supports the academic freedom of faculty to select high quality course materials for their students while imposing several new provisions to ensure that students have timely access to affordable course materials at postsecondary institutions receiving Federal financial assistance. These provisions support that effort and include the following:

  • When textbook publishers provide information on a college textbook or supplemental material to faculty in charge of selecting course materials at postsecondary institutions, that information must be in writing (including electronic communication) and must include
    • the price of the textbook;
    • the copyright dates of the three previous editions (if any);
    • a description of substantial content revisions;
    • whether the textbook is available in other formats and if so, the price to the institution and to the general public;
    • the separate prices of textbooks unbundled from supplemental material; and
    • to the maximum extent possible, the same information for custom textbooks.
  • To the maximum extent practicable, an institution must include on its Internet course schedule for required and recommended textbooks and supplemental material
    • the International Standard Book Number (ISBN) and retail price;
    • if the ISBN is not available, the author, title, publisher, and copyright date; or
    • if such disclosure is not practicable, the designation “To Be Determined.”

    If applicable, the institution must include on its written course schedule a reference to the textbook information available on its Internet schedule and the Internet address for that schedule.

  • A postsecondary institution must provide the following information to its college bookstores upon request by such college bookstore:
    • the institution’s course schedule for the subsequent academic period; and
    • for each course or class offered, the information it must include on its Internet course schedule for required and recommended textbooks and supplemental material, the number of students enrolled, and the maximum student enrollment.
  • Institutions disclosing the information they must include on their Internet course schedules for required and recommended textbooks and supplemental material are encouraged to provide information on
    • renting textbooks;
    • purchasing used textbooks;
    • textbook buy-back programs; and
    • alternative content delivery programs.

The HEOA also requires the Government Accountability Office (GAO) to study the implementation of this section and report to Congress (See Non-institutional Studies, Reports, and Summits, U.S. Government Accountability Office (GAO) Studies and Reports, Textbook Information)

The Secretary is prohibited from regulating on this section of the HEA, but will monitor institutions and review student complaints relating to these provisions.

The law says that this provision “shall take effect on July 1, 2010” so schools have a little less than a year now to adjust their internal data gathering and reporting systems. I haven’t been able to find further guidance on the Department of Education website or at other sources. This affects Ohio’s efforts in promoting lower-cost, highly-effective course materials, so if anyone knows of other information, please let me know.
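For anyone adjusting a course-schedule system, the fallback order in the summary above (ISBN and retail price; failing that, author, title, publisher, and copyright date; failing that, “To Be Determined”) might be sketched roughly as follows. The record type, field names, and output format are hypothetical, and the sketch makes one simplifying assumption that the summary does not address: an item with an ISBN but no known price falls through to the next option.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CourseMaterial:
    """Hypothetical record for one required or recommended item."""
    isbn: Optional[str] = None
    retail_price: Optional[str] = None
    author: Optional[str] = None
    title: Optional[str] = None
    publisher: Optional[str] = None
    copyright_date: Optional[str] = None


def schedule_disclosure(item: CourseMaterial) -> str:
    """Return the text to show on the Internet course schedule."""
    # First choice: the ISBN and the retail price.
    if item.isbn and item.retail_price:
        return f"ISBN {item.isbn}, {item.retail_price}"
    # Second choice: author, title, publisher, and copyright date.
    fallback = [item.author, item.title, item.publisher, item.copyright_date]
    if all(fallback):
        return ", ".join(fallback)
    # Last resort allowed by the summary.
    return "To Be Determined"
```

The point of the sketch is only that the three options are ordered; a real system would also need to decide who supplies the data and when.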


A colleague points out that the summary missed a crucial aspect of the legislation. Under publisher requirements, the law has this clause as well:

Unbundling of college textbooks from supplemental materials.– A publisher that sells a college textbook and any supplemental material accompanying such college textbook as a single bundle shall also make available the college textbook and each supplemental material as separate and unbundled items, each separately priced.

The text was modified to update a link on November 16th, 2012.

EBSCO in Cahoots With Harvard Business Press

A controversy is starting to pick up in the business librarian community (primarily in the U.K., it would seem) regarding the licensing demands of Harvard Business Press (HBP) for the inclusion of Harvard Business Review articles in EBSCOhost. HBP content in EBSCOhost carries a publisher-specific rider that says use is limited to “private individual use” and explicitly bars the practice of putting “deep links” of articles from EBSCOhost (so-called “persistent links”) into learning management systems. In my words, HBP is attempting to limit access to its content in EBSCOhost to those who find it through the serendipity of searching. And now HBP is going after schools that are using persistent linking, and this raises all sorts of troubling questions.

Flat World Knowledge and U.S. Gov’t on Open Access Course Materials

The sand is really starting to shift under the traditional textbook providers as the open course content movement shows signs of, well, movement. Already this year there are two events that point to shifts in how instructors and students can shortcut the complex ecosystem of textbooks as we know it today. First, Flat World Knowledge — a provider of open access course materials — launched earlier this year. Second, new legislation has been proposed in the U.S. Congress to mandate that some agencies use their funding to produce open access course materials.

Flat World Knowledge Launches

Flat World Knowledge launched earlier this year, and it brings an entrepreneurial feel to the staid subject of textbooks. Billed as “the world’s first publisher of open-source college textbooks,” their website has a scrappy, web2.0 start-up feel to it. It should probably come as no surprise, then, that they are a web2.0 start-up — they recently received $8 million in venture capital funding. To faculty and staff in higher education, Flat World Knowledge describes themselves this way:

We preserve the best of the old – textbooks by leading experts.

Then we flip it on its head.

Our books cost $0 online.  We provide paperbacks, audio books, and self-print versions for under $30.  Our books are open for you to edit for your class.  Our new editions are on your terms.  We publish them – you decide if and when to use them.

They offer free versions of their textbooks online, then charge for various derivatives and additions. Instructors can modify the textbook by rearranging chapters, adding or deleting chunks of text, and (coming soon according to the site) adding materials based on a database of what is available at Flat World Knowledge. (One has to register on the site to do this, but you can watch a video tutorial to get an idea about how it works.) Students get flexibility, too; one scenario from their website is:

Kayo doesn’t read books online. She orders the black and white softcover for about $29 bucks. It shows up in a few days. Too bland for her friend Sam – he orders the color edition for $59. Not Sharon. She commutes everyday, so nothing but the audio book on her iPod will do. Then there’s Chaz. He’s indecisive. He decides, well, not to decide. He’ll order the self-print .pdf chapters when he needs them for $1.99 per chapter. Cool. And don’t forget Tessa. She never has enough time. She’ll cut to the chase with our mp3 study guides, mobile flash cards, and online practice quizzes with feedback. That’s convenient. That’s choices. That’s Flat World Knowledge.

Right now their catalog is focused heavily on business topics, but they are looking to expand beyond it. (Into sociology, geographic information systems, and genetics according to their latest newsletter.) Here are the course materials available now and what they have in the pipeline.

Title | Author(s) | Pub Date | Relevant Course(s)
Exploring Business | Collins, Karen | Feb-09 | Introduction to Business
Fundamentals of Income Tax Theory and Practice | Kiefer, Dieter | Mar-09 | Federal Taxation; Federal and State Taxation
Introduction to Economic Analysis | McAfee, R. Preston; Lewis, Tracy R. | Mar-09 | Intermediate Microeconomics, Managerial Economics
Organizational Behavior | Bauer, Talya; Erdogan, Berrin | Mar-09 | Organizational Behavior
Principles of Management | Carpenter, Mason; Bauer, Talya; Erdogan, Berrin | Mar-09 | Principles of Management
Launch! Advertising and Promotion in Real Time | Solomon, Michael; Duke Cornell, Lisa; Nizan, Amit | Mar-09 | Advertising and Promotion
Principles of Macroeconomics | Rittenberg, Libby; Tregarthen, Timothy | Apr-09 | Principles of Macroeconomics
Money and Banking | Wright, Robert E.; Quadrini, Vincenzo | Apr-09 | Financial Markets and Institutions, Money and Banking
Principles of Microeconomics | Rittenberg, Libby; Tregarthen, Timothy | Apr-09 | Principles of Microeconomics
Risk Management for Enterprises and Individuals | Baranoff, Etti; Brockett, Patrick Lee; Kahane, Yehuda | Apr-09 | Insurance, Risk Management
Atlas Black: Managing to Succeed | Short, Jeremy; Bauer, Talya; Ketchen, Dave; Simon, Len | Apr-09 | Organizational Behavior, Principles of Management
Principles of Economics | Rittenberg, Libby; Tregarthen, Ti | May-09 | Principles of Economics
Financial Accounting | Hoyle, Joe Ben; Skender, C. J. | Oct-09 | Financial Accounting
Basics of Oral Business Communication | McLean, Scott | Oct-09 | Oral Business Communication
Basics of Written Business Communication | McLean, Scott | Oct-09 | Written Business Communication
Information Systems: A Manager’s Guide to Harnessing Technology | Gallaugher, John | Oct-09 | Management Information Systems
Principles of Marketing | Tanner, Jeff; Raymond, Mary Anne; Schuster, Camille | Oct-09 | Principles of Marketing
Creative Destruction: The Economics of E-Commerce and the Internet | Koch, James | Feb-10 | Electronic Commerce
Personal Finance | Siegel, Rachel | Feb-10 | Personal Finance
Project Management in a Virtual World | Darnall, Russell; Preston, John M. | Feb-10 | Project Management
Sustainability, Innovation, and Entrepreneurship | Larson, Andrea | Feb-10 | Entrepreneurship, Sustainability
Franchising: A Graphic Novel | Combs, Jim; Ketchen, Dave; Short, Jeremy; Simon, Len | May-10 | Franchising, Small Business Mgmt

H.R. 1464 — The LOW COST Act

This bill is cleverly named: the Learning Opportunities With Creation of Open Source Textbooks (LOW COST) Act. Let’s set aside my twitching in response to this use of the phrase “open source” in this context; the correct form of “open” is probably “open access”, but that would ruin the acronym. (I had the same reaction to how the Flat World Knowledge folks used this phrase, too, so I should probably get over it.) The bill would require federal agencies that spend more than $10 million a year on science education and outreach to put at least 2% of those funds toward the development of related, college-level educational resources.


  (a) In General- Not later than 1 year after the date of the enactment of this Act, the head of each agency that expends more than $10,000,000 in a fiscal year on scientific education and outreach shall use at least 2 percent of such funds for the collaboration on the development and implementation of open source materials as an educational outreach effort in accordance with subsection (b).
  (b) Requirements- The head of each agency described in subsection (a) shall, under the joint guidance of the Director of the National Science Foundation and the Secretary of Energy, collaborate with the heads of any of the agencies described in such subsection or any federally supported laboratory or university-based research program to develop, implement, and establish procedures for checking the veracity, accuracy, and educational effectiveness of open source materials that–
    (1) contain, at minimum, a comprehensive set of textbooks or other educational materials covering topics in college-level physics, chemistry, or math;
    (2) are posted on the Federal Open Source Material Website;
    (3) are updated prior to each academic year with the latest research and information on the topics covered in the textbooks or other educational materials available on the Federal Open Source Material Website; and
    (4) are free of copyright violations.
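To make the arithmetic in the first clause concrete, here is a small Python sketch of the minimum set-aside. The function and constant names are made up for illustration; the only figures taken from the bill text are the $10,000,000 threshold (which must be exceeded, not merely met) and the 2 percent floor. The sketch works in whole dollars and rounds up so that the result is never below 2 percent.

```python
THRESHOLD_DOLLARS = 10_000_000  # "more than $10,000,000 in a fiscal year"
SET_ASIDE_PERCENT = 2           # "at least 2 percent of such funds"


def minimum_open_materials_spend(outreach_spend_dollars: int) -> int:
    """Minimum whole dollars an agency would owe toward open materials."""
    if outreach_spend_dollars <= THRESHOLD_DOLLARS:
        return 0  # the mandate applies only above the threshold
    # Integer ceiling of spend * 2 / 100, so the floor is always met.
    return (outreach_spend_dollars * SET_ASIDE_PERCENT + 99) // 100
```

An agency spending $50 million on scientific education and outreach would owe at least $1 million under this reading; one spending exactly $10 million would owe nothing.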

The bill is sponsored by Representative Bill Foster of Illinois, and it is currently in the House committees on Education and Labor as well as Science and Technology. There are no co-sponsors to the bill, which I don’t think is a good sign, so I’m not expecting it to go far. Still, the sentiment is nice, so it is one to watch.

I’ve also heard through the grapevine that there is a bill being worked up to be proposed in the U.S. Senate that would set aside money for the development of open access course materials. So, at the very least, the notion of open access course materials seems to be catching on from top-down funders.

The text was modified to update a link on November 13th, 2012.

Online Editions of Out-of-Print Books Result from Library/Press Partnership at Univ of Pittsburgh

Late last month, the University of Pittsburgh Press and Library System announced a joint effort to revive 500 titles with online and print-on-demand access. I originally found this via a post on the Course materials, Innovation, and Technology in Education (CITE) blog. Since we have been ramping up discussions here in Ohio about ways OhioLINK can be an aggregation point for efforts at the four university press services in Ohio, I was interested to read about this and learn more.

Clarification Offered for “Technology: The textbook of the future” in Nature

An article by Declan Butler in a recent issue of Nature, “Technology: The textbook of the future”, included a paragraph about OhioLINK’s exploration of digital textbooks:

Ongoing tests of CourseSmart e-textbooks by the University System of Ohio show that they reduce costs — the average US student forks out some $900 annually on print textbooks — and students using them perform just as well as when using paper versions, says Peter Murray, deputy head of new service development at the Ohio Library and Information Network in Columbus, Ohio, which assists the University System of Ohio on the project.

I’m afraid I didn’t clarify the particulars of our efforts in the phone call with the reporter. Our test for effectiveness of electronic course materials was with a category of materials we call “enhanced textbooks”. They are the platforms that offer not only the text but also links to videos, glossary terms, pre- and post-texts, supplementary reading materials, and simulations. Examples of these are Wiley Plus from Wiley Publishing and the Campbell/Reece biology offerings from Pearson. Another program of the University System of Ohio is the e-Textbook portal featuring page-for-page replication e-books from CourseSmart. We have not tested the CourseSmart material for effectiveness compared to the identical material in printed form.

Butler, D. (2009). Technology: The textbook of the future. Nature, 458(7238), 568-570. DOI: 10.1038/458568a

The text was modified to update a link on November 13th, 2012.

What Does the Google Book Settlement Mean for the Online Book Market?

The blog post title is a serious question — it is one that I need some help figuring out: What Does the Google Book Settlement Mean for the Online Book Market? There have been stories and speculation about how Google is going to turn the settlement for the class-action lawsuit against its library book scanning project into a monopoly — or in the case of the recent Ars Technica article, a duopoly — of online publishing. I just don’t see it happening without the publishers explicitly allowing it to happen.

The crux of the issue is in the definition of a “book”; according to the preliminary settlement agreement (paragraph 1.16, page 3, emphasis added):

“Book” means a written or printed work that (a) if a “United States work,” as defined in 17 U.S.C. § 101, has been registered with the United States Copyright Office as of the Notice Commencement Date, (b) on or before the Notice Commencement Date, was published or distributed to the public or made available for public access as a set of written or printed sheets of paper bound together in hard copy form under the authorization of the work’s U.S. copyright owner, and (c) as of the Notice Commencement Date, is subject to a Copyright Interest.

The Notice Commencement Date was defined as January 5, 2009 in the court’s order granting preliminary settlement approval (paragraph 16, page 4). In other words, the definition of “book” as it relates to the settlement is something registered with the copyright office and published before January 5, 2009. The settlement does not cover books published after that date. If publishers want books published after that date to appear in the Google Book Search service, they must use the Google Books Partners Program. If Google scans a book that was registered and published after January 5th, it would appear that they would be subject to another lawsuit since the settlement agreement under consideration by the court provides no protection for such scans.
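Reading the three prongs of the definition as a simple test makes the cutoff easier to see. The Python sketch below is one possible reading, not legal advice: the record type and its field names are hypothetical, and it collapses “published or distributed ... in hard copy form under the authorization of the work’s U.S. copyright owner” into a single publication date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

NOTICE_COMMENCEMENT_DATE = date(2009, 1, 5)


@dataclass
class Work:
    """Hypothetical record; the field names are illustrative only."""
    is_us_work: bool
    registered_on: Optional[date]   # None if never registered
    published_on: Optional[date]    # authorized hard-copy publication
    in_copyright: bool              # subject to a Copyright Interest


def is_settlement_book(work: Work) -> bool:
    """Apply prongs (a), (b), and (c) of the quoted definition."""
    # (a) A United States work must be registered as of the date.
    if work.is_us_work and (
        work.registered_on is None
        or work.registered_on > NOTICE_COMMENCEMENT_DATE
    ):
        return False
    # (b) It must have been published on or before the date.
    if work.published_on is None or work.published_on > NOTICE_COMMENCEMENT_DATE:
        return False
    # (c) It must still be subject to a Copyright Interest.
    return work.in_copyright
```

Under this reading, a title registered and published in March 2009 fails prong (b) and so falls outside the settlement, which is the case the rest of this post is concerned with.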

So what am I missing?

  1. There is nothing in the settlement agreement that covers books registered and published after January 5th.
  2. Publishers have a choice to use one or more distribution channels: bricks-and-mortar stores, e-commerce of physical copies, selling digital copies, adding digital copies to Google Book Search, and any other scheme they might dream up.
  3. Publishers don’t seem to be shifting their distribution activities to a Google-only model through the publisher partners program.
  4. A library will still need to purchase books for newly published materials through the channels that publishers choose.
  5. Google may face new lawsuits if it scans books from libraries that were registered/published after January 5, 2009.
  6. A subscription to the anticipated library licensing scheme for the Google Book Search database is not a substitute for traditional library collection development.

Are pundits and prognosticators assuming that the availability of books published prior to January 5th and scanned from libraries will force publishers to put their newly-published content into Google Book Search? Is it conceivable that Google Book Search will be in such a dominant position as to compel publishers to release materials only through Google Books Publisher Partner Program?

If you see things differently, please help me out here…