“We are scanning them to be read by an AI.”

Towards the end of the last chapter of his book, Nicholas Carr relates an anecdote about the visit of a guest speaker to the Google headquarters (emphasis added):

George Dyson, a historian of technology…, Freeman Dyson, was invited to Google’s headquarters in Mountain View, California, in October 2005 to give a speech at the party celebrating the sixtieth anniversary of von Neumann’s invention [of an electronic computer that could store in its memory the instructions for its use]. “Despite the whimsical furniture and other toys,” Dyson would later recall of his visit, “I felt I was entering a 14th-century cathedral, not in the 14th century but in the 12th century, while it was being built. Everyone was busy carving one stone here and another stone there, with some invisible architect getting everything to fit. The mood was playful, yet there was a palpable reverence in the air.” After his talk, Dyson found himself chatting with a Google engineer about the company’s controversial plan to scan the contents of the world’s libraries into its database. “We are not scanning all of those books to be read by people,” the engineer told him. “We are scanning them to be read by an [artificial intelligence engine].”

A Glimpse into the Internet Archive’s Scanning and Print-on-Demand Operations

Wired magazine published a brief story and online photo gallery of the book scanning and print-on-demand projects at the Internet Archive. It is a fascinating glimpse into their vision and processes. Included below are cropped thumbnails and part of the text captions that accompanied the pictures in the Wired online gallery.

The book to be scanned sits in front of a technician underneath a V-shaped glass platter. Two opposing cameras angled at each page take photos of the book. On screen is the multipage view that the operator uses to verify the quality of the scans and the book’s pagination.

New Blog for Ebooks in Libraries: “No Shelf Required”

Sue Polanka, head of reference and instruction at the main library of Wright State University, sent a message to the OhioLINK membership today about a new blog she is moderating called No Shelf Required:

No Shelf Required provides a forum for discussion among librarians, publishers, distributors, aggregators, and others interested in the publishing and information industry. The discussion will focus on the issues, concepts, current and future practices of Ebook publishing including: finding, selecting, licensing, policies, business models, usage (tracking), best practices, and promotion/marketing. The concept of the blog is to have open discussion, propose ideas, and provide feedback on the best ways to implement Ebooks in library settings. The blog will be a moderated discussion with timely feature articles and product reviews available for discussion and comment.

Out of Print Books Get New Life via Amazon and Participating Libraries

Why settle for mere digital copies of books (a la the Google Book Search project and the Open Content Alliance) when you can have an edition printed, bound and sent to you in the mail? That’s the twist behind a recent partnership announced by Amazon.com, Kirtas Technologies, Emory University, University of Maine, Toronto Public Library, and the Public Library of Cincinnati and Hamilton County.

Brewster Kahle on the Economics and Feasibility of Mass Book Digitization

Brewster Kahle, Director of the Internet Archive, was interviewed this week in a Chronicle of Higher Education podcast on the Economics and Feasibility of Mass Book Digitization. Among the many interesting points in the interview was that one of the biggest challenges to such a mass digitization effort is simply believing that digitizing massive numbers of books and making them available is actually possible. The Open Content Alliance has put together a suite of technology that brings down the cost for a color scan with OCR to 10 cents per page, or about $30 per book. He then goes on to perform this calculation: the library system in the U.S. is a $12 billion industry. One million books digitized a year is $30 million, or “a little less than .3 percent of one year’s budget of the United States library system would build a 1 million book library that would be available to anyone for free.” He also covers copyright concerns, including the more liberal copyright laws in countries such as China.
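For readers who want to check the arithmetic, here is a minimal sketch in Python. It uses only the figures quoted in the interview summary above; the constant and function names are hypothetical, the 300-page average book is implied by the two cost figures rather than stated by Kahle, and the $12 billion figure is treated as a single year's budget, as his quote suggests.

```python
# Figures as quoted in the interview summary; all names are illustrative.
COST_PER_PAGE_CENTS = 10               # color scan with OCR
COST_PER_BOOK_DOLLARS = 30             # "about $30 per book"
BOOKS_PER_YEAR = 1_000_000
LIBRARY_BUDGET_DOLLARS = 12_000_000_000  # assumed to be one year's budget


def implied_pages_per_book() -> int:
    """Average page count implied by the per-page and per-book costs."""
    return COST_PER_BOOK_DOLLARS * 100 // COST_PER_PAGE_CENTS


def annual_cost_dollars() -> int:
    """Cost of digitizing one million books in a year."""
    return BOOKS_PER_YEAR * COST_PER_BOOK_DOLLARS


def budget_share_percent() -> float:
    """Annual digitization cost as a percentage of the library budget."""
    return 100 * annual_cost_dollars() / LIBRARY_BUDGET_DOLLARS


if __name__ == "__main__":
    print(f"Implied pages per book: {implied_pages_per_book()}")
    print(f"Annual cost: ${annual_cost_dollars():,}")
    print(f"Share of budget: {budget_share_percent():.2f}%")
```

Under these assumptions the share comes to 0.25 percent, which agrees with Kahle's "a little less than .3 percent."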

Just In Time Acquisitions versus Just In Case Acquisitions

What if a service existed where patrons selected an item they needed from our library catalog and that item was delivered to them, even when the library did not yet own it? Would that be useful? With the growth of online bookstores, our users do have the expectation of finding something they need on the web, clicking a few buttons and having it delivered. When such expectations of what is possible exist, where is the first place a patron would go to find recently published items: the online bookstore or their local library catalog? Does your gut tell you it is the online bookstore? Would it be desirable if the patron’s instinct were to go to the local library catalog?