<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"><channel><title>Disruptive Library Technology Jester &#187; digitization</title> <atom:link href="http://dltj.org/tag/digitization/feed/" rel="self" type="application/rss+xml" /><link>http://dltj.org</link> <description>We&#039;re Disrupted, We&#039;re Librarians, and We&#039;re Not Going to Take It Anymore</description> <lastBuildDate>Mon, 06 Feb 2012 20:04:22 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <cloud domain='dltj.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' /> <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/3.0/us/</creativeCommons:license> <item><title>Thursday Threads: Personal Book Digitizer, Status of Book Piracy, Core Elements of Description</title><link>http://dltj.org/article/thursday-threads-2011w3/</link> <comments>http://dltj.org/article/thursday-threads-2011w3/#comments</comments> <pubDate>Thu, 20 Jan 2011 11:50:44 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Thursday Threads]]></category> <category><![CDATA[digital rights management]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[Karen Smith-Yoshimura]]></category> <category><![CDATA[MARC]]></category> <category><![CDATA[metadata]]></category> <category><![CDATA[piracy]]></category> <category><![CDATA[publishing]]></category> <category><![CDATA[textbook]]></category><guid isPermaLink="false">http://dltj.org/?p=2330</guid> <description><![CDATA[Receive DLTJ Thursday 
Threads:by&#160;E-mailby&#160;RSSDelivered by FeedBurnerIt wasn&#8217;t too long ago that the music industry was in an uproar about stories of how easy it was to copy digital audio files and make digital copies with high fidelity. It was predicted &#8230; <a href="http://dltj.org/article/thursday-threads-2011w3/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=2330"></abbr><div id="feedburner-thursday-threads-email-2011w03" class="wp-caption alignright noprint noFrontPage" style="width: 230px;;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><form style="border: 1px solid rgb(204, 204, 204); padding: 3px; margin: 0pt; text-align: center;" action="http://feedburner.google.com/fb/a/mailverify" method="post" target="popupwindow" onsubmit="window.open('http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads', 'popupwindow', 'scrollbars=yes,width=550,height=520');return true"><p>Receive <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> Thursday Threads:</p><p>by&nbsp;<a href="http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads&amp;loc=en_US" title="D.L.T.J. Thursday Threads Email Subscription">E-mail</a><br /><input style="width: 140px;" name="email" value="Your e-mail address" onfocus="if (this.defaultValue==this.value) this.value = ''" type="text"/><input value="thursday-threads" name="uri" type="hidden"/><input name="loc" value="en_US" type="hidden"/><input value="Subscribe" type="submit"/></p><p>by&nbsp;<a href="http://feeds.dltj.org/thursday-threads/" title="D.L.T.J. 
Thursday Threads RSS Feed">RSS</a></p><p style="font-size: 80%;">Delivered by <a href="http://feedburner.google.com" target="_blank" title="Google Feedburner Service">FeedBurner</a></p></form></div><p>It wasn&#8217;t too long ago that the music industry was in an uproar about stories of how easy it was to copy digital audio files and make digital copies with high fidelity.  It was predicted that we would see the same thing in other media forms, and this week&#8217;s <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym> Thursday Threads</i> has two stories on the topic of book publishing.  First is news of another inexpensive and simple (and now to be commercially produced) <a href="#booksaver">book digitizing system</a>.  Although the process of &#8220;ripping&#8221; a book from its physical medium might take longer than ripping an audio track, devices of this kind are emerging that will make it simple to do.  What happens with the digital copy after that?  The second Thursday Threads pointer is to an <a href="#book-piracy">interview</a> with the founder of a book publishing industry consultancy about the state of book piracy, how it is measured, and why digital rights management software is a poor way to stop it.  The last entry this week is a <a href="#corebibdescr">short excerpt of a brief summary</a> of a study conducted by OCLC last year on the usage of MARC tags in cataloging records.<br /><span id="more-2330"></span><br />As a side note, apologies to <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> readers who had problems reading some of the content here over the past couple of weeks.  A series of problems with my personal server &#8212; driven by the fact, I believe, that the server was first set up about 10 years ago and all the patches, tweaks, and updates over the decade have finally driven performance into the ground &#8212; prompted me to migrate this blog to Amazon&#8217;s Web Services cloud.  
It is now running on a micro <a href="http://aws.amazon.com/ec2/" title="Amazon Elastic Compute Cloud (Amazon EC2)">Elastic Compute Cloud (EC2)</a> virtual machine backed by <a href="http://aws.amazon.com/s3/" title="Amazon Simple Storage Service (Amazon S3)">Simple Storage Service (S3)</a> and the <a href="http://aws.amazon.com/cloudfront/" title="Amazon CloudFront">CloudFront</a> content distribution network.  I&#8217;ve also been optimizing the snot out of the configuration &#8212; employing all sorts of new tricks for reducing the time it takes to deliver pages to your browser.  I have another blog post in draft with the details for when anyone (even me!) wants to replicate it.  Given enough personal time, watch for that in the next week or so.</p><p>All of that said, if you are seeing things that don&#8217;t look or function right, <a href="http://dltj.org/contact/">please let me know</a>.</p><p><h2 id="booksaver">Book Saver &#8211; A personal book digitization setup from ION</h2><br /><div id="attachment_2333" class="wp-caption alignright" style="width: 310px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><a href="http://www.ionaudio.com/booksaver" title="http://www.ionaudio.com/booksaver"><img src="http://cdn.dltj.org/wp-content/uploads/2011/01/booksaver_angle_lrg-300x187.jpg" alt="Booksaver from ION" title="Booksaver from ION" width="300" height="187" class="size-medium wp-image-2333" /></a><br /><iframe title="YouTube video player" class="youtube-player" type="text/html" width="298" height="198" src="http://www.youtube.com/embed/annCmIa-a08" frameborder="0"></iframe><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Picture and Demonstration Video of the Book Saver from ION</p></div></p><blockquote><p>Book Saver has two cameras that take separate images in rapid succession of each page within an open book. 
Both cameras of Book Saver also have a flash for allowing the page to be fully illuminated during the scanning process. Book Saver’s cradle, where the book is placed during the scanning process, is also angled as to not require you to hold pages down to get a flat, even surface. While similar devices require up to seven seconds per one page, Book Saver takes only one second per two pages!</p></blockquote><p>News of the new <a href="http://www.ionaudio.com/booksaver" title="http://www.ionaudio.com/booksaver">Book Saver</a> product comes from <a href="http://www.librarybazaar.com/2011/01/15/book-saver-vs-drm/" title="Book Saver vs. DRM? | Library Bazaar">Fiacre O&#8217;Duinn</a>.  It is a hand-held device for digitizing book materials.  The promotional literature says it takes about 15 minutes to digitize a 200-page book.  The product was <a href="http://www.ionaudio.com/content380172" title="http://www.ionaudio.com/content380172">announced</a> in time for the Consumer Electronics Show earlier this month, but is not yet available.  It is expected to ship this summer with a <a href="http://www.crunchgear.com/2011/01/12/ion-audio-book-saver-does-just-that-saves-books/" title="Ion Audio Book Saver Does Just That, Saves Books">manufacturer&#8217;s suggested retail price of $189</a> (I&#8217;m already seeing price points of <a href="http://www.mobilemag.com/2011/01/12/ions-book-saver-book-scanner-scans-200-page-books-in-15-minutes/" title="Ion Book Scanner digitizes your 200-page books in 15 minutes for eReading | Mobile Magazine">$149</a> mentioned).</p><p>One of the &#8220;Key Features&#8221; listed on the product page is that the device &#8220;eliminates the need to purchase electronic versions of reading material you already own.&#8221;  As Fiacre points out in his post, this really brings down the cost (in equipment and in effort) of digitally reproducing books.  Are we about to see a new wave of personal book sharing/piracy?  
And what will the impact on libraries be?  In the higher education arena, it is already being mentioned as a way to <a href="http://www.hackcollege.com/blog/2011/1/10/hands-on-with-the-ion-audio-book-saver.html" title="Hands On with the Ion Audio Book Saver | HackCollege">digitize textbooks</a>.  It is conceivable that students would <a href="http://dltj.org/article/textbooks-on-reserve/" title="Textbooks On Reserve Program at Miami University | DLTJ">borrow textbooks</a> from our libraries, digitize them in an afternoon, and return them &#8212; or maybe just digitize them in the library.  Do we need to get ahead of devices like this with education and policy initiatives?</p><p><h2 id="book-piracy">Book Piracy: Less DRM, More Data</h2></p><blockquote><p>As digital book publishing continues to expand at a rapid pace to meet reader demands, piracy rears its head at the forefront of many a discussion in publisher circles. Many publishers respond to the perceived threat with strict digital rights management (DRM) software. But is this the best solution? And does it even provide protection from piracy?</p><p>In the following interview, <a href="http://magellanmediapartners.com/" title="Magellan Media Partners">Magellan Media</a> founder and TOC 2011 speaker <a href="http://www.toccon.com/toc2011/public/schedule/speaker/5146?cmp=il-radar-tc11-oleary-piracy" title="Speaker: Brian O’Leary: O'Reilly Tools of Change for Publishing Conference 2011 - O'Reilly Conferences, February 14 - 16, 2011, New York">Brian O&#8217;Leary</a> (<a href="http://twitter.com/brianoleary" title="http://twitter.com/brianoleary">@brianoleary</a>) discusses the current state of book piracy, how measurement data isn&#8217;t sufficient to determine its impact, and why DRM is a poor anti-piracy tool.</p></blockquote><p>The same arguments in favor of digital rights management for the music sector are now being made in the book publishing sector. 
<a href="http://radar.oreilly.com/2011/01/book-piracy-drm-data.html" title="Book piracy: Less DRM, more data - O'Reilly Radar">This interview</a> comes from the perspective of why DRM is the wrong answer to the perceived problem of book piracy.  The backdrop is <a href="https://en.oreilly.com/toc2011/public/register?cmp=il-radar-tc11-oleary-piracy">O&#8217;Reilly Media&#8217;s Tools of Change for Publishing</a> conference to be held next month in New York City.</p><p><h2 id="corebibdescr">Core Bibliographic Description</h2></p><blockquote><p>Those “outliers” can be categorized according to three general purposes:</p><ul><li><em>Provenance and Identity</em>: identifiers (e.g. ISBN, OCLC, etc.) and cataloging source (040)</li><li><em>Elements useful for discovery:</em> title statement (245), personal names (100, 700) and subject (650)</li><li><em>Elements useful for understanding and evaluation:</em> publication statement (260), physical description (300), and notes (500)</li></ul><p>That’s it. In a nutshell you have the very core of bibliographic description as defined by librarians over the last century or so.</p></blockquote><p>This <a href="http://hangingtogether.org/?p=834" title="The Core of Bibliographic Description | hangingtogether.org">post</a> by <a href="http://hangingtogether.org/?page_id=207" title="Roy Tennant Biography">Roy Tennant</a> briefly summarizes the work of OCLC Research staff member <a href="http://www.oclc.org/research/people/smith-yoshimura.htm" title="Karen Smith-Yoshimura | OCLC - People">Karen Smith-Yoshimura</a>.  
The research work was to <a href="http://www.oclc.org/research/activities/attributes/default.htm" title="Gather Evidence to Inform Changes in MARC Metadata Practices [OCLC - Activities]">gather evidence to inform changes in MARC metadata practices</a>, and that project page includes a <a href="http://www.oclc.org/research/publications/library/2010/2010-06.pdf" title="Implications of MARC Tag Usage on Library Metadata Practices report in PDF">72-page report</a> [PDF] and an Excel <a href="http://cdn.dltj.org/wp-content/uploads/2011/01/2010-06a.xls" title="Full Data Tables Related to MARC Tag Usage in WorldCat">spreadsheet of data tables</a> along with <a href="http://www5.oclc.org/downloads/research/webinars/20100318mtu.wmv" title="Audio in WMV format of results webinar">audio</a> and <a href="http://www5.oclc.org/downloads/research/webinars/20100318mtu.mp4" title="Video recording in MPEG4 format of the results webinar">video</a> of a <a href="http://www.catalogingfutures.com/catalogingfutures/2010/04/webinar-implications-of-marc-tag-usage-on-library-metadata.html" title="Cataloging Futures: Webinar: Implications of MARC tag usage on library metadata">one-hour webinar</a> on the report.  In my <a href="http://friendfeed.com/dltj/710d04c0/core-of-bibliographic-description-oclc" title="The Core of Bibliographic Description | Peter Murray's FriendFeed">FriendFeed posting of Roy&#8217;s article</a>, <a href="http://waltcrawford.name/" title="Walt Crawford">Walt Crawford</a> noted a similar finding in his 1986 <a href="http://books.google.com/books?id=9NXgAAAAMAAJ&#038;dq=Bibliographic+Displays+in+the+Online+Catalog&#038;hl=en&#038;ei=ZHI3TeCzLIH-8Ab79s2cBA&#038;sa=X&#038;oi=book_result&#038;ct=result&#038;resnum=1&#038;ved=0CC8Q6AEwAA" title="Bibliographic displays in the online catalog | Google Book Search">Bibliographic displays in the online catalog</a>.  
As Walt notes, &#8220;somehow it&#8217;s not surprising that it&#8217;s still true in 2010.&#8221;</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/thursday-threads-2011w3/feed/</wfw:commentRss> <slash:comments>8</slash:comments> <enclosure url="http://www5.oclc.org/downloads/research/webinars/20100318mtu.wmv" length="68512623" type="video/asf" /> <enclosure url="http://www5.oclc.org/downloads/research/webinars/20100318mtu.mp4" length="288204112" type="video/mp4" /> </item> <item><title>Thursday Threads: Amazon Pressures Publishers, Academic Spam, Mechanical Turk Spam, Multispectral Imaging</title><link>http://dltj.org/article/thursday-threads-2010w52/</link> <comments>http://dltj.org/article/thursday-threads-2010w52/#comments</comments> <pubDate>Thu, 30 Dec 2010 12:07:28 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Thursday Threads]]></category> <category><![CDATA[Amazon]]></category> <category><![CDATA[Amazon Mechanical Turk]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[Google Scholar]]></category> <category><![CDATA[jpeg2000]]></category> <category><![CDATA[preservation]]></category> <category><![CDATA[publishing]]></category> <category><![CDATA[search engine]]></category> <category><![CDATA[spam]]></category><guid isPermaLink="false">http://dltj.org/?p=1931</guid> <description><![CDATA[Receive DLTJ Thursday Threads:by&#160;E-mailby&#160;RSSDelivered by FeedBurner With the close of the year approaching, this issue marks the 14th week of DLTJ Thursday Threads. 
This issue has a publisher&#8217;s view of Amazon&#8217;s strong-arm tactics in book pricing, research into the possibility &#8230; <a href="http://dltj.org/article/thursday-threads-2010w52/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=1931"></abbr><div id="feedburner-thursday-threads-email-w52" class="wp-caption alignright" style="width: 230px;;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><form style="border:1px solid #ccc;padding:3px;margin:0;text-align:center;" action="http://feedburner.google.com/fb/a/mailverify" method="post" target="popupwindow" onsubmit="window.open('http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads', 'popupwindow', 'scrollbars=yes,width=550,height=520');return true"><p>Receive <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> Thursday Threads:</p><p>by&nbsp;<a href="http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads&#038;loc=en_US" title="D.L.T.J. Thursday Threads Email Subscription">E-mail</a><br /><input type="text" style="width:140px" name="email" value="Your e-mail address" onFocus="if (this.defaultValue==this.value) this.value = ''"/><input type="hidden" value="thursday-threads" name="uri"/><input type="hidden" name="loc" value="en_US"/><input type="submit" value="Subscribe" /></p><p>by&nbsp;<a href="http://feeds.dltj.org/thursday-threads/" title="D.L.T.J. Thursday Threads RSS Feed">RSS</a></p><p style="font-size: 80%">Delivered by <a href="http://feedburner.google.com" target="_blank" title="Google Feedburner Service">FeedBurner</a></p></form></div><p> With the close of the year approaching, this issue marks the 14th week of <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym> Thursday Threads</i>.  
This issue has a publisher&#8217;s view of Amazon&#8217;s strong-arm tactics in book pricing, research into the possibility that academic authors could game Google Scholar with spam, demonstrations of how Amazon&#8217;s Mechanical Turk drives down the cost of enlisting humans to overwhelm anti-spam systems, and a story of multispectral imaging adding information in the process of digital preservation.</p><p>As the new year approaches, I wish you the best professionally and personally.</p><p><h2><a name="books_after_amazon">Books After Amazon</a></h2></p><blockquote><p>What happens when an industry concerned with the production of culture is beholden to a company with the sole goal of underselling competitors? Amazon is indisputably the king of books, but the issue remains, as Charlie Winton, CEO of the independent publisher Counterpoint Press puts it, “what kind of king they’re going to be.” A vital publishing industry must be able take chances with new authors and with books that don’t have obvious mass-market appeal. When mega-retailers have all the power in the industry, consumers benefit from low prices, but the effect on the future of literature—on what books can be published successfully—is far more in doubt.</p></blockquote><p><a href="http://www.bostonreview.net/BR35.6/roychoudhuri.php" title="Boston Review &amp;mdash; Onnesha Roychoudhuri: Books After Amazon">Onnesha Roychoudhuri publishes this view of Amazon&#8217;s marketing practices</a> in the latest issue of the <a href="http://www.bostonreview.net/" title="Boston Review &amp;mdash; Home">Boston Review</a>.  From the publisher&#8217;s perspective, the strong-arm tactics described sound horrible.  But the story also points to cracks appearing &#8212; at least for the bigger publishers.  That may leave smaller, independent publishers in a big squeeze.  
[Via OCLC Research's <a href="http://www.oclc.org/research/publications/newsletters/abovethefold/2010-12-17.htm" title="http://www.oclc.org/research/publications/newsletters/abovethefold/2010-12-17.htm">Above-the-Fold</a>]</p><p><h2><a name="academic_spam">Academic Search Engine Spam and Google Scholar&#8217;s Resilience Against it</a></h2></p><blockquote><p>Abstract: In a previous paper we provided guidelines for scholars on optimizing research articles for academic search engines such as Google Scholar. Feedback in the academic community to these guidelines was diverse. Some were concerned researchers could use our guidelines to manipulate rankings of scientific articles and promote what we call ‘academic search engine spam’. To find out whether these concerns are justified, we conducted several tests on Google Scholar. The results show that academic search engine spam is indeed—and with little effort—possible: We increased rankings of academic articles on Google Scholar by manipulating their citation counts; Google Scholar indexed invisible text we added to some articles, making papers appear for keyword searches the articles were not relevant for; Google Scholar indexed some nonsensical articles we randomly created with the paper generator SciGen; and Google Scholar linked to manipulated versions of research papers that contained a Viagra advertisement. At the end of this paper, we discuss whether academic search engine spam could become a serious threat to Web-based academic search engines.</p></blockquote><p><a href="http://quod.lib.umich.edu/cgi/t/text/text-idx?c=jep;view=text;rgn=main;idno=3336451.0013.305" title="Academic Search Engine Spam and Google Scholar's Resilience Against it">Joeran Beel and Bela Gipp have this article</a> in the most recent issue of <a href="http://www.journalofelectronicpublishing.org/" title="The Journal of Electronic Publishing: Welcome">Journal of Electronic Publishing</a>.  
In addition to being able to game <a href="http://scholar.google.com/" title="Google Scholar">Google Scholar</a>, the authors note that <a href="http://academic.research.microsoft.com/" title="Microsoft Academic Search">Microsoft Academic Search</a> and <a href="http://citeseer.ist.psu.edu/" title="CiteSeerX">CiteSeer</a> (as well as their own academic search engine currently under development &#8212; <a href="http://SciPlore.org/" title="SciPlore: Exploring Science">SciPlore</a>) have the same issues.  Although it is possible, we don&#8217;t know if it is being done &#8212; or even if there would be any penalties in the academic community for doing so.</p><p><h2><a name="mechanical_turk_spam">Mechanical Turk: Now with 40.92% spam</a></h2></p><blockquote><p>At this point, Amazon Mechanical Turk has reached the mainstream. Pretty much everyone knows about the concept. Post small tasks online, pay people cents, and get thousands of micro-tasks completed. Unfortunately, this resulted in some unfortunate trends. Anyone who frequents just a little bit the market will notice the tremendous number of spammy HITs. (HIT = a task posted for completion in the market; stands for Human Intelligence Task). &#8220;Test if the ads in my website work&#8221;. &#8220;Create a Twitter account and follow me&#8221;. &#8220;Like my YouTube video&#8221;. &#8220;Download this app&#8221;. &#8220;Write a positive review on Yelp&#8221;. A seemingly endless amount of spam HITs come to the market, mainly with the purpose of spamming &#8220;social media&#8221; metrics. So, with Dahn Tamir and Priya Kanth (MS student at NYU), we decided to examine how big is the problem. How many spammers join the market? 
How many spam HITs are there?</p></blockquote><p>This post from Panos Ipeirotis, Associate Professor at the IOMS Department at Stern School of Business of New York University, describes a <a href="http://behind-the-enemy-lines.blogspot.com/2010/12/mechanical-turk-now-with-4092-spam.html" title="Mechanical Turk: Now with 40.92% spam. - A Computer Scientist in a Business School">review of activities</a> posted to <a href="https://www.mturk.com/mturk/welcome">Amazon&#8217;s Mechanical Turk</a> service.  Spam is everywhere, and it appears that the Mechanical Turk is reducing the friction between buyers and workers of spam activity. [Via Ron Murray]</p><p><h2><a name="multispectral_imaging">Cutting-Edge Imaging Helps Scholar Reveal 8th-Century Manuscript</a></h2></p><blockquote><p>With a manuscript like the St. Chad Gospels, multispectral imaging—a series of scans, each based on a single part of the color spectrum—allows his team to create images that have the equivalent of three-dimensional detail, down to revealing the thickness of brush strokes on letters and illustrations. Cockled pages can be virtually flattened out so that all their details can be studied. Studied color band by color band, the chemical composition of ink can be determined.</p></blockquote><p>This <a href="http://chronicle.com/article/Cutting-Edge-Imaging-Helps/125616/" title="Cutting-Edge Imaging Helps Scholar Reveal 8th-Century Manuscript - Research - The Chronicle of Higher Education">article</a> by Jennifer Howard at the Chronicle of Higher Education reviews the story of how 8th-century documents in England were digitized by scholars at the University of Kentucky.  It caught my eye because of the mention of multispectral imaging; this is something that the JPEG2000 file format can natively store.  Digitization at this level doesn&#8217;t just provide alternative, online access to documents &#8212; it actually adds new information to the process of researching those documents.  
[Note: the link is behind a publisher paywall. If you would like to see it, send me an e-mail and I'll forward you a short-term link from the Chronicle's website.]</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/thursday-threads-2010w52/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>Interesting Google Book Search Settlement Bits in Advance of Thursday&#8217;s Fairness Hearing</title><link>http://dltj.org/article/interesting-gbs-bits/</link> <comments>http://dltj.org/article/interesting-gbs-bits/#comments</comments> <pubDate>Tue, 16 Feb 2010 02:22:04 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[policy]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[Google Book Search]]></category> <category><![CDATA[Judge Denny Chin]]></category> <category><![CDATA[legal]]></category> <category><![CDATA[WorldCat]]></category><guid isPermaLink="false">http://dltj.org/?p=1529</guid> <description><![CDATA[Thursday will be a big day in the Google Book Search lawsuit settlement: the parties to the lawsuit, along with the objectors, supporters, and friends-of-the-court, will be in the courtroom of United States District Judge Denny Chin offering oral arguments &#8230; <a href="http://dltj.org/article/interesting-gbs-bits/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=1529"></abbr><p>Thursday will be a big day in the Google Book Search lawsuit settlement:  the parties to the lawsuit, along with the objectors, supporters, and friends-of-the-court, will be in the courtroom of United States District Judge Denny Chin offering oral arguments in the final settlement/fairness hearing. <a href="http://docs.justia.com/cases/federal/district-courts/new-york/nysdce/1:2005cv08136/273913/930/0.html" title="The Author's Guild et al v. Google Inc. 
Document 930 - :: Justia Docs">In his order</a>, Judge Chin recognized 26 parties that will speak for up to five minutes each on their positions in the settlement (21 in opposition, 5 in favor).  The U.S. Department of Justice will also speak at the hearing.  But I think we&#8217;re all eagerly waiting to hear what the judge himself will say about the settlement agreement.</p><p>In the lead-up to the hearing, Associate Professor <a href="http://james.grimmelmann.net/" title="James Grimmelmann homepage" rel="homepage">James Grimmelmann</a> at the New York Law School has continued <a href="http://laboratorium.net/" title="The Laboratorium">his efforts</a>, along with <a href="http://thepublicindex.org/about" title="About The Public Index">the students from the Institute for Information Law and Policy</a> at New York Law School, to make the documents and proceedings of the lawsuit accessible and understandable to non-lawyers.  In the most recent court filings leading up to Thursday&#8217;s hearing are some interesting nuggets.<br /><span id="more-1529"></span><br />In his <a href="http://laboratorium.net/archive/2010/02/15/gbs_a_little_on_the_fee_motion" title="The Laboratorium: GBS: A Little on the Fee Motion">posting</a> on the <a href="http://thepublicindex.org/docs/amended_settlement/Motion_for_fees.pdf" title="Notice of Motion and Motion for Approval of Attorneys' Fees and Reimbursement of Costs">motion for attorneys&#8217; fees</a>, he notes that &#8220;counsel for the author sub-class are asking for the full $30 million in fees and reimbursement of their out-of-pocket costs.&#8221;  The filing contains information about the number of hours and the billing rate for some of the lawyers working on the case.  Some of the stuff is just really interesting, like <a href="http://thepublicindex.org/docs/amended_settlement/Dumain_Declaration.pdf" title="Declaration of Sanford P. 
Dumain in Support of Final Settlement Approval and Application of Counsel for the Author Sub-Class for Award of Fees and Reimbursement of Costs">one filing</a> that included everything from 18 hours by a partner of a firm (who is also a law professor at <acronym title="New York University">NYU</acronym>) at a rate of $995/hour to an itemization of 51¢ for long distance calls by the firm related to the case.  Whew!</p><p>More interesting to <acronym title="Disruptive Library Technology Jester"><i>DLTJ</i></acronym> readers would be Grimmelmann&#8217;s <a href="http://laboratorium.net/archive/2010/02/15/gbs_some_highlights_of_dan_clancys_declaration" title="GBS: Some Highlights of Dan Clancy's Declaration">highlights</a> of <a href="http://thepublicindex.org/docs/amended_settlement/dan_clancy_declaration.pdf" title="Declaration of Daniel Clancy in Support of Motion for Final Approval of Amended Settlement Agreement">Dan Clancy&#8217;s declaration</a> in support of the agreement. <a href="http://www.computerhistory.org/events/index.php?spkid=0&amp;ssid=1246406058" title="Dan Clancy - Computer History Museum - Events">Dan Clancy</a> is engineering director of the Google Book Search project, so he has a unique insight into the inner workings.  Grimmelmann notes that Clancy states:<ul type="square"><li>To date, Google has Digitized over twelve million books, and intends to continue Digitizing books in the future.</li><li>Google has received metadata from 48 libraries.</li><li>Google pays approximately $2.5 million per year to license metadata from 21 commercial databases of information about books.</li><li>Google has gathered 3.27 billion records about Books, and analyzed them to identify more than 174 million unique works.</li></ul><p>The third bullet is interesting in that I think we can eliminate one of the &#8220;commercial databases&#8221; from the list.  
I can&#8217;t find it in my notes from ALA Midwinter, but I seem to recall hearing <a href="http://www.oclc.org/about/trustees/members/jay_jordan.htm" title="Jay Jordan [OCLC - 2009-2010 Board members]">Jay Jordan</a> (<a href="http://www.linkedin.com/pub/jay-jordan/0/495/86" title="Jay Jordan - LinkedIn">OCLC President</a>) say something along the lines that OCLC was not receiving a monetary return from the sharing of bibliographic data with Google; the value OCLC gets for its membership comes from the links back to WorldCat from Google services.  If I got this wrong, I hope someone from OCLC will call me out on it.</p><p>The last bullet is interesting, too:  Google has identified 174 million works in analyzing all of the sources of data coming into it.  I tried to find some numbers in the descriptions of WorldCat to compare that to, but didn&#8217;t have any luck this evening.  (There isn&#8217;t anything about statistics available on <a href="http://www.worldcat.org/" title="WorldCat Homepage" rel="homepage">http://worldcat.org/</a>?)</p><p>To Grimmelmann&#8217;s highlights I would add this statement that seems strangely out-of-place.</p><ul type="square"><li>Google has no interest in censorship. Indeed, Google&#8217;s mission is to organize the world&#8217;s information and make it universally accessible and useful.</li></ul><p>Has anyone brought censorship into the discussion yet?  Privacy for sure, but censorship?</p><p>Also:<ul type="square"><li>Google has developed algorithms to compare these numerous sources of metadata and identify the most accurate data about each book.</li></ul><p>They certainly seem to have invested a lot of effort in this area.  
More info can be found in <a href="http://dltj.org/article/mashups-of-bib-data/">my summary of Kurt Groetsch&#8217;s presentation at ALA Midwinter 2010</a>.</p><div class='series_links'><a href='http://dltj.org/article/revised-gbs-settlement/' title='Revised Google Book Search Settlement from a Library Perspective'>Previous in series</a> <a href='http://dltj.org/article/gbs-settlement-rejected/' title='Google Book Search Settlement Rejected'>Next in series</a></div>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/interesting-gbs-bits/feed/</wfw:commentRss> <slash:comments>9</slash:comments> </item> <item><title>Preserving Digital Video</title><link>http://dltj.org/article/preserving-digital-video/</link> <comments>http://dltj.org/article/preserving-digital-video/#comments</comments> <pubDate>Tue, 08 Apr 2008 20:22:14 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[accessibility]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[preservation]]></category> <category><![CDATA[standards]]></category> <category><![CDATA[video]]></category><guid isPermaLink="false">https://dltj.org/?p=348</guid> <description><![CDATA[My place of work is looking to acquire educational videos in a digital form with an eye towards long-term preservation. At this point we receive a physical form (preferably DVD, but sometimes VHS) and digitize it to a very lossy &#8230; <a href="http://dltj.org/article/preserving-digital-video/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="https://dltj.org/?p=348"></abbr><p>My place of work is looking to acquire educational videos in a digital form with an eye towards long-term preservation.  At this point we receive a physical form (preferably DVD, but sometimes VHS) and digitize it to a very lossy access format (RealMedia, in this case).  
With this change, we would get a preservation-worthy digital copy from the producer/distributor and forgo the physical version.</p><p>There is quite a lot written on preserving video, but I wanted to distill the requirements down into statements that vendors could reasonably provide today.  I think these are pretty sound requirements, but I&#8217;m looking for feedback.  In particular, I&#8217;m not quite sure how to handle the transfer of closed caption text from the publisher/distributor; suggestions are welcome.<br /><span id="more-348"></span><br />[Jester's note:  I just realized that an earlier version of this posting went out to the net about two hours before this "final" version.  Sorry about publishing the work-in-progress early; I must have hit the wrong button in the new version of WordPress...]</p><p><h2>File Formats</h2><br />Some of the clearest guidance on file formats comes from this short excerpt from the Moving Image section of the <a href="http://www.ahds.ac.uk/" title="The Arts and Humanities Data Service homepage">U.K. Arts and Humanities Data Service</a> <a href="http://www.ahds.ac.uk/preservation/ahds-preservation-documents.htm" title="AHDS Repository Policies and Procedures">Preservation Handbook</a>:</p><blockquote><p>Guidance on the preservation of digital video should, by necessity, change over time. [...] The MPEG-2 and MPEG-4 formats are better suited to high-quality digital video. MPEG-2 is better known for its use as a format for DVD-Video, which encourages confidence when considering the likelihood that the format will be readable in the long-term. The format has an average transfer rate of 2-5 megabits per second, but there may be disk space restraints and the software tools necessary to convert and store this format are costly. MPEG-4 has a lower transfer rate of 1-2 megabits per second and is intended for streaming video. 
Other codecs, such as QuickTime, Windows Media, Real Video and Open DIVX, are useful for specific purposes, but not suitable for preservation. <sup><a href="http://dltj.org/article/preserving-digital-video/#footnote_0_348" id="identifier_0_348" class="footnote-link footnote-identifier-link" title="Knight, G., &amp; McHugh, J. (2005). Preservation Handbook: Moving Image.  p. 3.">1</a></sup></p></blockquote><p>The Library of Congress Sustainability of Digital Formats site has <a href="http://www.digitalpreservation.gov/formats/fdd/fdd000028.shtml" title="http://www.digitalpreservation.gov/formats/fdd/fdd000028.shtml">an entry for MPEG-2</a> (also known as H.262) and <a href="http://www.digitalpreservation.gov/formats/fdd/fdd000155.shtml" title="MPEG-4 File Format, Version 2">an entry for MPEG-4</a> (more completely, MPEG-4 file format version #2) that give the nitty-gritty details for the file formats.</p><p>The preservation master copies we want to store have a frame size of 720 pixels by 480 pixels.  (That size is for NTSC format videos, common in the USA, Canada and Japan.  Master copies of PAL-format videos, common in Australia, New Zealand, the United Kingdom and most of Europe, are 720 x 576.)  This is the standard resolution used in MPEG-2-compressed commercially distributed DVD movies.<sup><a href="http://dltj.org/article/preserving-digital-video/#footnote_1_348" id="identifier_1_348" class="footnote-link footnote-identifier-link" title="Audio/Video Capture and Management (2002).">2</a></sup> These frame sizes are appropriate for analog video signals.  (&#8220;As defined by ITU-R Recommendation BT.601, more commonly know by the abbreviations Rec. 601 or BT.601 or its former name, CCIR 601. 
[It is] a standard published by the CCIR (now ITU-R) for encoding interlaced analogue video signals in digital form.&#8221;<sup><a href="http://dltj.org/article/preserving-digital-video/#footnote_2_348" id="identifier_2_348" class="footnote-link footnote-identifier-link" title="&#8220;Rec. 601&#8221; (2008).">3</a></sup> )  The audio is 48 kHz stereo at 224 kb/s or better.</p><p><h2>Captioning Text</h2><br />There appear to be two primary schemes for binding closed caption text with video files.  One, from the W3C, is <a href="http://www.w3.org/AudioVideo/" title="http://www.w3.org/AudioVideo/">Synchronized Multimedia Integration Language</a> (or SMIL), an XML format used by many media players.  The other is Microsoft&#8217;s <a href="http://msdn2.microsoft.com/en-us/library/ms971327.aspx" title="Object moved">Synchronized Accessible Media Interchange</a> (or SAMI), a pseudo-HTML format that is only read by Windows Media Player.</p><p>To make matters more complicated, a whole set of different schemes is used for DVDs.  (On VHS recordings, closed caption text was encoded in one of the non-visible lines that make up the video signal.  Since the DVD format only included visible lines, other schemes were required.)  The most popular seems to be the <a href="http://www.fileinfo.net/extension/scc" title="SCC File Extension - Open .SCC files">Scenarist Closed Caption (SCC) format</a>.  This is a binary file that exists on the DVD alongside the video files.</p><p><h2>Resources Consulted</h2></p><div style="line-height:1.1em;margin-left:0.5in;text-indent:-0.5in;margin-top:1.5em;"><p style="margin:0">Arms, C. R., &amp; Fleischhauer, C. Sustainability of Digital Formats: Planning for Library of Congress Collections. <span style="font-style:italic;">National Digital Information Infrastructure and Preservation Program</span>. 
Retrieved April 8, 2008, from <a href="http://www.digitalpreservation.gov/formats/" title="Sustainability of Digital Formats: Planning for Library of Congress Collections">http://www.digitalpreservation.gov/formats/</a>.</p><p style="margin:0"><span style="font-style:italic;">Audio/Video Capture and Management</span>. (2002). In <span style="font-style:italic;">NINCH Guide to Good Practice</span> (1st). Retrieved April 8, 2008, from <a href="http://www.nyu.edu/its/humanities/ninchguide/VII/" title="NINCH Guide to Good Practice">http://www.nyu.edu/its/humanities/ninchguide/VII/</a>.</p><p style="margin:0">Guideline H: Provide access to multimedia presentations for users with sensory disabilities. <span style="font-style:italic;">Accessible Digital Media: Design Guidelines for Electronic Publications, Multimedia and the Web</span>.  Retrieved April 14, 2008, from <a href="http://ncam.wgbh.org/invent_build/web_multimedia/accessible-digital-media-guide/guideline-h-multimedia" title="Accessible Digital Media: Guideline H: Multimedia">http://ncam.wgbh.org/publications/adm/guideline_h.html</a>.</p><p style="margin:0">Knight, G., &amp; McHugh, J. (2005). <span style="font-style:italic;">Preservation Handbook: Moving Image</span>. AHDS Preservation Handbook. 8 p. Arts and Humanities Data Service. Retrieved April 8, 2008, from <a href="http://www.ahds.ac.uk/preservation/video-preservation-handbook.pdf" title="AHDS&#039;s Preservation Handbook: Moving Image">http://ahds.ac.uk/preservation/video-preservation-handbook.pdf</a>.</p><p style="margin:0">Rec. 601. (2008, April 8). <span style="font-style:italic;">Wikipedia, the free encyclopedia</span>. 
Retrieved April 8, 2008, from <a href="http://en.wikipedia.org/wiki/Rec._601" title="http://en.wikipedia.org/wiki/Rec._601">http://en.wikipedia.org/wiki/Rec._601</a> (<a href="http://en.wikipedia.org/wiki/Rec._601?oldid=204278564" title="http://en.wikipedia.org/wiki/Rec._601?oldid=204278564">version at time of citation</a>).</p></div><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://ahds.ac.uk/ to http://www.ahds.ac.uk/ on January 28th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://ahds.ac.uk/preservation/ahds-preservation-documents.htm to http://www.ahds.ac.uk/preservation/ahds-preservation-documents.htm on January 28th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://ahds.ac.uk/preservation/video-preservation-handbook.pdf to http://www.ahds.ac.uk/preservation/video-preservation-handbook.pdf on January 28th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://ncam.wgbh.org/publications/adm/guideline_h.html to http://ncam.wgbh.org/invent_build/web_multimedia/accessible-digital-media-guide/guideline-h-multimedia on January 28th, 2011.</p><h2>Footnotes</h2><ol class="footnotes"><li id="footnote_0_348" class="footnote">Knight, G., &amp; McHugh, J. (2005). <span style="font-style:italic;"><a href="http://www.ahds.ac.uk/preservation/video-preservation-handbook.pdf" title="http://ahds.ac.uk/preservation/video-preservation-handbook.pdf">Preservation Handbook: Moving Image</a></span>.  p. 
3.</li><li id="footnote_1_348" class="footnote"><a href="http://www.nyu.edu/its/humanities/ninchguide/VII/" title="Audio/Video Capture and Management chapter of NINCH Guide to Good Practice">Audio/Video Capture and Management</a> (2002).</li><li id="footnote_2_348" class="footnote">&#8220;Rec. 601&#8221; (2008).</li></ol>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/preserving-digital-video/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> <item><title>A Glimpse into the Internet Archive&#8217;s Scanning and Print-on-Demand Operations</title><link>http://dltj.org/article/internet-archive-scanning-gallery/</link> <comments>http://dltj.org/article/internet-archive-scanning-gallery/#comments</comments> <pubDate>Thu, 20 Mar 2008 13:55:44 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[book]]></category> <category><![CDATA[Columbus OH]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[ebooks]]></category><guid isPermaLink="false">https://dltj.org/article/internet-archive-scanning-gallery/</guid> <description><![CDATA[Wired magazine published a brief story and online photo gallery of the book scanning and print-on-demand projects at the Internet Archive. It is a fascinating glimpse into their vision and processes. 
Included below are cropped thumbnails and part of the &#8230; <a href="http://dltj.org/article/internet-archive-scanning-gallery/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="https://dltj.org/article/internet-archive-scanning-gallery/"></abbr><p>Wired magazine published a <a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive" title="The Internet Archive Keeps Book-Scanning Free">brief story and online photo gallery of the book scanning and print-on-demand projects at the Internet Archive</a>.  It is a fascinating glimpse into their vision and processes.  Included below are cropped thumbnails and part of the text captions that accompanied the pictures in the Wired online gallery.</p><table><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=1&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/01_internet_archive_46_t.jpg" alt=""/></a></td><td>The book to be scanned sits in front of a technician underneath a V-shaped glass platter. Two opposing cameras angled at each page take photos of the book. 
On screen is the multipage view that the operator uses to verify the quality of the scans and the book&#8217;s pagination.</td></tr><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=2&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/02_comp2_t.jpg" alt=""/></a></td><td>Scanning books into the Internet Archive&#8217;s custom-built <a href="http://redjar.org/jared/blog/archives/2006/02/10/more-details-on-open-archives-scribe-book-scanner-project/" title="the future is yesterday  &amp;raquo; More details on Internet Archive&amp;#8217;s Scribe Book Scanner Project">Scribe Station</a> is a manual process. Although automated page-turning machines exist, Internet Archive has chosen to go the manual route due to the large amount of extremely delicate, rare and valuable manuscripts they scan.</td></tr><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=3&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/03_internet_archive_52_t.jpg" alt=""/></a></td><td>The book scanner uses off-the-shelf Canon hardware including the <a href="http://www.usa.canon.com/consumer/controller?act=ModelInfoAct&amp;fcategoryid=139&amp;modelid=10598" title="Canon Consumer Products">EOS 1-Ds Mark II</a> and the <a href="http://www.usa.canon.com/consumer/controller?act=ModelInfoAct&amp;fcategoryid=155&amp;modelid=7400" title="Canon Consumer Products">EF 100 mm f/2.8 macro lens</a>. The newer systems use the <a href="http://www.usa.canon.com/consumer/controller?act=ModelInfoAct&amp;fcategoryid=139&amp;modelid=11933" title="Canon Consumer Products">5-D</a> instead of the 1-Ds, which saves money in the short term. 
But, according to Internet Archive staff, the 5-D fails much more frequently, resulting in increased maintenance costs.</td></tr><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=4&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/04_internet_archive_50_t.jpg" alt=""/></a></td><td>At the start of every shift the operator calibrates the color levels using a pair of color-calibration cards. When the scanning project first started, Internet Archive attempted to color correct the scanned pages to white, but later decided to capture and store them as they are in their various aged shades of yellow. Preservation of the oxidized tints makes the virtual viewing of old books more lifelike.</td></tr><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=5&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/05_comp5_t.jpg" alt=""/></a></td><td>At the turn of the last century, fold-out illustrations were all the rage. These foldouts are cool to look at, but present a problem for scanning due to their size. When an operator comes across one of these foldouts in a book, they scan the closed version and note the foldout in the Scribe software. Later, another scanner is used consisting of a camera mounted on a copy stand.</td></tr><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=6&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/06_internet_archive_14_t.jpg" alt=""/></a></td><td>Soon, you&#8217;ll be able to print books found at the Internet Archive with this self-contained, fully automated book machine. Send it a PDF and it will print and bind it into a complete book. 
The process takes about 10 minutes depending on the size of the book, and costs $10 plus a penny per page.</td></tr><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=7&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/07_comp3_t.jpg" alt=""/></a></td><td>Inside the book machine, the laser-printed pages are trimmed, then slathered with adhesive on what will become the book&#8217;s spine. The cover is then wrapped around the book. After another trim, out pops a custom-printed book ready for reading.</td></tr><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=8&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/08_comp1_t.jpg" alt=""/></a></td><td>Instead of stacks of books, these archival volumes are now contained in racks of <a href="http://www.capricorn-tech.com/tbseries.php" title="http://www.capricorn-tech.com/tbseries.php">160 terabyte boxes</a>. Multiple redundant copies of the archive&#8217;s data are spread across servers all over the world.</td></tr><tr><td><a href="http://www.wired.com/entertainment/theweb/multimedia/2008/03/gallery_internet_archive?slide=9&amp;slideView=2" title="The Internet Archive Keeps Book-Scanning Free"><img src="http://cdn.dltj.org/wp-content/uploads/2008/03/09_internet_archive_55_t.jpg" alt=""/></a></td><td>Before entering the world of public-domain-promoting nonprofits, Robert Miller spent the last few decades at the top levels of various brick-and-mortar tech corporations. 
He is currently the director of books at the Internet Archive, and it&#8217;s his vision that drives the archive&#8217;s quest to digitize all public-domain knowledge and publish it online.</td></tr></table>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/internet-archive-scanning-gallery/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Vint Cerf on the Origins of 32-bit IP Addressing</title><link>http://dltj.org/article/vint-cerf-ip-addressing/</link> <comments>http://dltj.org/article/vint-cerf-ip-addressing/#comments</comments> <pubDate>Sat, 08 Mar 2008 03:55:42 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[Google]]></category> <category><![CDATA[internet]]></category> <category><![CDATA[ipv6]]></category> <category><![CDATA[networking]]></category><guid isPermaLink="false">https://dltj.org/article/vint-cerf-ip-addressing/</guid> <description><![CDATA[Via a weekly wrap-up post by Dion Almaer on the Google Code Blog comes mention of a Google Tech Talk video from their IPv6 Conference 2008. It is a panel discussion called &#8220;What will the IPv6 Internet look like?&#8221; and &#8230; <a href="http://dltj.org/article/vint-cerf-ip-addressing/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="https://dltj.org/article/vint-cerf-ip-addressing/"></abbr><p>Via a <a href="http://google-code-updates.blogspot.com/2008/03/code-review-no-more-contact-scraping.html" title="Google Code Blog: The Code Review: No more contact scraping, sync your calendar, and Gears in your pocket">weekly wrap-up post by Dion Almaer on the Google Code Blog</a> comes mention of a Google Tech Talk video from their IPv6 Conference 2008.   
It is a panel discussion called &#8220;<a href="http://www.youtube.com/watch?v=mZo69JQoLb8" title="YouTube - Google IPv6 Conference 2008:  What will the IPv6 Internet look like?">What will the IPv6 Internet look like?</a>&#8221; and it offers insight into the difficulties of transitioning to <a href="http://www.ipv6.org/" title="IPv6: The Next Generation Internet!">the next generation IP transport protocol</a>.  Although it has been years since I&#8217;ve seen the business end of managing an actual IP network, I found the discussion a fascinating look at the issues that are ahead of network engineers and device manufacturers around the world.</p><div style="float:right;width:410;padding:0 0 2.5em 3.5em"><object type="application/x-shockwave-flash" data="http://www.youtube.com/v/mZo69JQoLb8#13m1s" width="400" height="326"><param name="movie" value="http://www.youtube.com/v/mZo69JQoLb8#13m1s" /><param name="FlashVars" value="playerMode=embedded" /></object></div><p>The part that caught my ears, though, was an exchange between <a href="http://en.wikipedia.org/wiki/Vinton_Cerf" title="Vint Cerf article on Wikipedia">Vint Cerf</a>, vice president and chief internet evangelist at Google, and Bob Hinden, chief internet technologist at Nokia Networks.  It starts at 13 minutes and one second into the video with Vint as moderator of the panel addressing a question from the audience about whether the panelists are proud of the work done on IPv6.</p><dl><dt class="speaker">Vint Cerf</dt><dd>Well, just speaking for myself &#8212; like I said earlier this morning &#8212; I believe that v6 is the only thing that we can do right now to make sure that address space is available and that we preserve as much as possible the end to end structure of the network.</dd><dt class="speaker">Bob Hinden</dt><dd>Can I get one other comment in here?  You reminded me of something.  So back when Vint and everyone was starting the v4 &#8212; the current internet &#8212; was not a sure thing.  
Back, you know, 15, 20 years ago.  And there were lots of &#8211;</dd><dt class="speaker">Vint Cerf</dt><dd>I&#8217;m sorry, it&#8217;s 30 years ago because the decision &#8212; [laughter].  No, I&#8217;m serious, the decision to put a 32-bit address space on there was the result of a year&#8217;s battle among a bunch of engineers who couldn&#8217;t make up their minds about 32, 128 or variable length.  And after a year of fighting I said &#8212; I&#8217;m now at ARPA, I&#8217;m running the program, I&#8217;m paying for this stuff and using American tax dollars &#8212; and I wanted some progress because we didn&#8217;t know if this is going to work.  So I said 32 bits, it is enough for an experiment, it is 4.3 billion terminations &#8212; even the defense department doesn&#8217;t need 4.3 billion of anything and it couldn&#8217;t afford to buy 4.3 billion edge devices to do a test anyway.  So at the time I thought we were doing an experiment to prove the technology and that if it worked we&#8217;d have an opportunity to do a production version of it.  Well &#8212; [laughter] &#8212; it just escaped! &#8212; it got out and people started to use it and then it became a commercial thing.  So, this [IPv6] is the production attempt at making the network scalable.  
Only 30 years later.</dd></dl>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/vint-cerf-ip-addressing/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>Soundprint&#8217;s &#8216;Who Needs Libraries?&#8217;</title><link>http://dltj.org/article/who-needs-libraries/</link> <comments>http://dltj.org/article/who-needs-libraries/#comments</comments> <pubDate>Fri, 01 Feb 2008 20:15:49 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Disruption in Libraries]]></category> <category><![CDATA[academic libraries]]></category> <category><![CDATA[audio]]></category> <category><![CDATA[California Digital Library]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[libraries]]></category> <category><![CDATA[preservation]]></category><guid isPermaLink="false">http://dltj.org/article/who-needs-libraries/</guid> <description><![CDATA[OhioLINK&#8217;s Meg Spernoga pointed our staff to a 30 minute audio documentary called Who Needs Libraries? from Soundprint.org:As more and more information is available on-line, as Amazon rolls out new software that allows anyone to find any passage in any &#8230; <a href="http://dltj.org/article/who-needs-libraries/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/article/who-needs-libraries/"></abbr><p>OhioLINK&#8217;s Meg Spernoga pointed our staff to a 30 minute audio documentary called <a href="http://www.soundprint.org/radio/display_show/ID/629/name/Who+needs+libraries" title="Who needs libraries?">Who Needs Libraries?</a> from Soundprint.org:</p><blockquote><p>As more and more information is available on-line, as Amazon rolls out new software that allows anyone to find any passage in any book, an important question becomes: Who needs libraries anymore? Why does anyone need four walls filled with paper between covers? 
Surprisingly, they still do and in this program Producer Richard Paul explores why; looking at how university libraries, school libraries and public libraries have adapted to the new information world. This program airs as part of our ongoing series on education and technology, and is funded in part by the U.S. Department of Education.</p><p>Produced by Richard Paul. Hosted by Lisa Simeone.</p></blockquote><p>Some of the topics covered:</p><ul type="square"><li>Numbers of New/Renovated public libraries are steady</li><li>Use of consortial depositories by academic libraries</li><li>Licensed content that can&#8217;t be found in Google (pros &#8212; immediate access; and cons &#8212; preservation)</li><li>Widespread digitization of content for online access (pros and cons)</li><li>Impact of Gates Foundation money on public library services</li><li>Changing ways libraries are being used</li></ul><p>Thanks, Meg!</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/who-needs-libraries/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Out of Print Books Get New Life via Amazon and Participating Libraries</title><link>http://dltj.org/article/amazon-kirtas-libraries/</link> <comments>http://dltj.org/article/amazon-kirtas-libraries/#comments</comments> <pubDate>Sun, 24 Jun 2007 12:31:38 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Disruption in Libraries]]></category> <category><![CDATA[Amazon]]></category> <category><![CDATA[book]]></category> <category><![CDATA[commerce]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[disruptive innovation]]></category> <category><![CDATA[libraries]]></category><guid isPermaLink="false">http://dltj.org/2007/06/amazon-kirtas-libraries/</guid> <description><![CDATA[Why settle for mere digital copies of books (a la the Google Book Search project and the Open Content Alliance) when you can have an edition printed, bound and sent to you in the 
mail? That&#8217;s the twist behind a &#8230; <a href="http://dltj.org/article/amazon-kirtas-libraries/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/06/amazon-kirtas-libraries/"></abbr><p>Why settle for mere digital copies of books (<i>a la</i> the <a href="http://books.google.com/" title="Google Book Search">Google Book Search project</a> and the <a href="http://www.opencontentalliance.org/" title="Open Content Alliance homepage">Open Content Alliance</a>) when you can have an edition printed, bound and sent to you in the mail?  That&#8217;s the twist behind a recent partnership announced by <a href="http://phx.corporate-ir.net/phoenix.zhtml?c=176060&#038;p=irol-newsArticle&#038;ID=1018605&#038;highlight=" title="Amazon press release">Amazon.com</a>, <a href="http://www.kirtas-tech.com/News.asp" title="Kirtas news page">Kirtas Technologies</a>, <a href="http://news.emory.edu/Releases/KirtasPartnership1181162558.html" title="Emory University News Release - Kirtas Partnership" class="broken_link" rel="nofollow">Emory University</a>, University of Maine, Toronto Public Library, and the Public Library of Cincinnati and Hamilton County.</p><p>More information via <a href="http://news.com.com/8301-10784_3-9732767-7.html" title="CNET News.com article &#039;Amazon enters book digitization jungle with rare-book project&#039;">C|Net News</a>, <a href="http://chronicle.com/weekly/v53/i44/44a02701.htm" title="Chronicle of Higher Education article">The Chronicle of Higher Education</a> (subscription required), and <a href="http://www.insidehighered.com/news/2007/06/22/digitize" title="Inside Higher Ed article &#039;An Alternative to Google&#039;">Inside Higher Ed</a>.  
I&#8217;m putting this in the &#8220;Disruption in Libraries&#8221; category because it is an example of using a technical innovation to serve an un-served or under-served population &#8212; not only the digitization of books but also the ability to deliver a physical reproduction to the user.  That aspect makes this program distinct from the others, and it is the first time that we&#8217;ve seen a glimpse of a reasonable business model:  costs recovered and profits made that go back into the digitization program for new books.  Since this is a non-exclusive agreement that puts the libraries in control, the texts can be made available freely online or available at a nominal cost to the user in a physical form.</p><p>[Update 20070704T0904 : Ack!  I linked to the wrong Chronicle of Higher Ed article.  Fixed now -- thanks Jodi.]</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/amazon-kirtas-libraries/feed/</wfw:commentRss> <slash:comments>13</slash:comments> </item> </channel> </rss>
