<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"><channel><title>Disruptive Library Technology Jester &#187; Open Library</title> <atom:link href="http://dltj.org/tag/openlibrary/feed/" rel="self" type="application/rss+xml" /><link>http://dltj.org</link> <description>We&#039;re Disrupted, We&#039;re Librarians, and We&#039;re Not Going to Take It Anymore</description> <lastBuildDate>Mon, 06 Feb 2012 20:04:22 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <cloud domain='dltj.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' /> <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/3.0/us/</creativeCommons:license> <item><title>Thursday Threads: HarperCollins Ebook Terms, Internet Archive Ebook Sharing, Future of Collections</title><link>http://dltj.org/article/thursday-threads-2011w9/</link> <comments>http://dltj.org/article/thursday-threads-2011w9/#comments</comments> <pubDate>Thu, 03 Mar 2011 03:35:45 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Thursday Threads]]></category> <category><![CDATA[David Lewis]]></category> <category><![CDATA[disruptive innovation]]></category> <category><![CDATA[ebooks]]></category> <category><![CDATA[HarperCollins-OverDrive controversy]]></category> <category><![CDATA[Internet Archive]]></category> <category><![CDATA[licensing]]></category> <category><![CDATA[Open Library]]></category><guid isPermaLink="false">http://dltj.org/?p=2690</guid> <description><![CDATA[Receive DLTJ Thursday 
Threads:by&#160;E-mailby&#160;RSSDelivered by FeedBurner It is an all e-books edition of DLTJ Thursday Threads this week. The biggest news was the announcement of the policy change by HarperCollins for ebooks distributed through OverDrive. Beyond that, though, was an &#8230; <a href="http://dltj.org/article/thursday-threads-2011w9/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=2690"></abbr><div id="feedburner-thursday-threads-email-2011w09" class="wp-caption alignright noprint noFrontPage" style="width: 230px;;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><form style="border: 1px solid rgb(204, 204, 204); padding: 3px; margin: 0pt; text-align: center;" action="http://feedburner.google.com/fb/a/mailverify" method="post" target="popupwindow" onsubmit="window.open('http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads', 'popupwindow', 'scrollbars=yes,width=550,height=520');return true"><p>Receive <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> Thursday Threads:</p><p>by&nbsp;<a href="http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads&amp;loc=en_US" title="D.L.T.J. Thursday Threads Email Subscription">E-mail</a><br /><input style="width: 140px;" name="email" value="Your e-mail address" onfocus="if (this.defaultValue==this.value) this.value = ''" type="text"/><input value="thursday-threads" name="uri" type="hidden"/><input name="loc" value="en_US" type="hidden"/><input value="Subscribe" type="submit"/></p><p>by&nbsp;<a href="http://feeds.dltj.org/thursday-threads/" title="D.L.T.J. 
Thursday Threads RSS Feed">RSS</a></p><p style="font-size: 80%;">Delivered by <a href="http://feedburner.google.com" target="_blank" title="Google Feedburner Service">FeedBurner</a></p></form></div><p> It is an all e-books edition of <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym> Thursday Threads</i> this week.  The biggest news was the <a href="#hcod">announcement of the policy change</a> by HarperCollins for ebooks distributed through OverDrive.  Beyond that, though, was an announcement of a <a href="#ia-ol-ill">new sharing model and program</a> through the Internet Archive.  Lastly is a slidecast recording of a presentation by David Lewis on the <a href="#collections-futures">future of library collections</a>.</p><p>Before continuing, a quick apology and explanation.  E-mail readers received a pair of extra Thursday Threads messages and RSS subscribers got a dump of unrelated posts; I&#8217;m sorry.  The cause was an update of this blog&#8217;s WordPress software to <a href="http://wordpress.org/news/2011/02/threeone/" title="WordPress 3.1, lots of fun">version 3.1</a> and a conflict (<a href="http://wordpress.org/support/topic/plugin-simple-tags-category-archive-wordpress-31" title="WordPress &#8250; Support &raquo; [Plugin: Simple Tags] Category Archive - WordPress 3.1">maybe this one</a>) with the <a href="http://wordpress.org/extend/plugins/simple-tags/" title="WordPress &#8250; Simple Tags &laquo; WordPress Plugins">SimpleTags</a> plugin.  I believe all is well, but I won&#8217;t know until this post is published.</p><p>Feel free to send this to others you think might be interested in the topics.  If you find these threads interesting and useful, you might want to add the <a href="http://feeds.dltj.org/thursday-threads/" title="RSS Feed for DLTJ Thursday Threads">Thursday Threads RSS Feed</a> to your feed reader or subscribe to e-mail delivery using the form to the right.  
If you would like a more raw and immediate version of these types of stories, watch <a href="http://friendfeed.com/dltj" title="Peter Murray - FriendFeed">my FriendFeed stream</a> (or subscribe to <a href="http://friendfeed.com/dltj?format=atom" title="Atom feed for Peter Murray's FriendFeed account">its feed</a> in your feed reader).  Comments and tips, as always, are <a href="http://dltj.org/contact">welcome</a>.</p><p><h2 id="hcod">HarperCollins Puts 26 Loan Cap on Ebook Circulations</h2></p><blockquote><p>In the first significant revision to lending terms for  ebook circulation, HarperCollins has announced that new titles licensed from  library ebook vendors will be able to circulate only 26 times before the license  expires.</p><p>Mention of the new terms was first made in a letter from  OverDrive CEO Steve Potash to customers yesterday. He wrote  [emphasis in original]:</p><blockquote><p>[W]e have been required to  accept and accommodate new terms for eBook lending as <strong><em>established by certain  publishers</em>.</strong> Next week, OverDrive will communicate a licensing  change from a publisher that, while still operating under the one-copy/one-user  model, will include a checkout limit for each eBook licensed. Under this  publisher&#8217;s requirement, for every new eBook licensed, the library (and the  OverDrive platform) will make the eBook available to one customer at a time  until the total number of permitted checkouts is  reached.</p></blockquote><p>Though the letter leaves the publisher unnamed,  HarperCollins confirmed today  to <em>[Library Journal]</em> that it is the publisher referred  to.</p></p></blockquote><p>In an odd one-two punch, this past week saw a disturbance in the status quo of e-book licensing.  
The first punch came in the <a href="http://librarianbyday.net/localwp-content/uploads/2011/02/OverDrive-Library-Partner-Update-from-Steve-Potash-2-24-2011.pdf" title="Letter from Steve Potash of Overdrive">letter from OverDrive</a> [PDF] (part of which is quoted in the Library Journal article excerpted above).  The second came in <a href="http://www.libraryjournal.com/lj/home/889452-264/harpercollins_puts_26_loan_cap.html.csp" title="HarperCollins Puts 26 Loan Cap on Ebook Circulations | Library Journal">that Library Journal article</a>, when we learned that the publisher pushing for the change of terms is HarperCollins.  Since then the change has been the source of a great deal of discussion by librarians and a few <a href="http://www.courtneymilan.com/ramblings/2011/02/25/on-eating-your-seed-corn/" title="On eating your seed corn | Courtney Milan&#8217;s Blog">authors</a>, much of it in the form of <a href="http://search.twitter.com/search?q=%23hcod" title="#hcod - Twitter Search">tweets with the hash-tag &#8220;#hcod&#8221;</a> (short for HarperCollinsOverDrive).  Damage control comes in the form of open letters from <a href="http://overdriveblogs.com/library/2011/03/01/a-message-from-overdrive-on-harpercollins-new-ebook-licensing-terms/" title="A message from OverDrive on HarperCollins&#8217; new eBook licensing terms | OverDrive&#039;s Digital Library Blog">OverDrive</a> and <a href="http://harperlibrary.typepad.com/my_weblog/2011/03/open-letter-to-librarians.html" title="Open Letter to Librarians | Library Love Fest">HarperCollins</a>.  There has been a <a href="http://loosecannonlibrarian.net/?p=396" title="On Boycotts and Readers&#8217; Rights | Loose Cannon Librarian">call</a> for a <a href="http://boycottharpercollins.com/" title="Boycott HarperCollins">boycott</a>.  
Bobbi Newman, <a href="http://librarianbyday.net/2011/02/25/publishing-industry-forces-overdrive-and-other-library-ebook-vendors-to-take-a-giant-step-back/" title="Publishing Industry Forces OverDrive and Other Library eBook Vendors to Take a Giant Step Back | Librarian by Day">one of the first to jump on the story</a>, is maintaining a <a href="http://www.delicious.com/librarianbyday/hcod" title="librarianbyday's hcod Bookmarks   on Delicious">list of news articles and commentary</a>.</p><div id="attachment_2673" class="wp-caption alignright" style="width: 310px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><br /><style type='text/css'>.bbpBox41505956953067520{background:url(http://a3.twimg.com/a/1298584552/images/themes/theme1/bg.png) #C0DEED;padding:20px}p.bbpTweet{background:#fff;padding:10px
12px 10px 12px;margin:0;min-height:48px;color:#000;font-size:18px !important;line-height:22px;-moz-border-radius:5px;-webkit-border-radius:5px}p.bbpTweet
span.metadata{display:block;width:100%;clear:both;margin-top:8px;padding-top:12px;height:40px;border-top:1px solid #fff;border-top:1px solid #e6e6e6}p.bbpTweet span.metadata
span.author{line-height:19px}p.bbpTweet span.metadata span.author
img{float:left;margin:0
7px 0 0px;width:38px;height:38px}p.bbpTweet a:hover{text-decoration:underline}p.bbpTweet
span.timestamp{font-size:12px;display:block}</style><div class='bbpBox41505956953067520'><p class='bbpTweet'>We&#8217;re reading your posts &#038; listening to our authors. If you want to share longer thoughts w us, email library.ebook@harpercollins.com <a href="http://twitter.com/search?q=%23hcod" title="#hcod" class="tweet-url hashtag" rel="nofollow">#hcod</a><span class='timestamp'><a title='Sat Feb 26 14:32:45 +0000 2011' href='http://twitter.com/#!/HarperCollins/status/41505956953067520' title="http://twitter.com/#!/HarperCollins/status/41505956953067520">Feb 26, 2011</a> via <a href="http://www.hootsuite.com" rel="nofollow" title="301 Moved Permanently">HootSuite</a></span><span class='metadata'><span class='author'><a href='http://twitter.com/HarperCollins' title="http://twitter.com/HarperCollins"><img src="http://cdn.dltj.org/wp-content/uploads/2011/03/FireWater_normal.gif" /></a><strong><a href='http://twitter.com/HarperCollins' title="http://twitter.com/HarperCollins">HarperCollins</a></strong><br />HarperCollins</span></span></p></div><p><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Tweet from HarperCollins</p></div><p>As you can see, much has already been said about the issue, and since collection development is not my specialty, you probably shouldn&#8217;t look to me for an informed opinion.  (If pressed, I will suggest that it is a perfectly reasonable collection development policy to not buy access to material with terms that are not in the library&#8217;s and patron&#8217;s best interest.)  Instead, there is so much oddness in this new policy that I find I can&#8217;t put myself in HarperCollins&#8217; shoes.  First, using OverDrive as a proxy for announcing this policy change seems wrong (and, frankly, unfair to OverDrive).  Then as word spreads, you don&#8217;t make your own announcement, but rather talk to a reporter from Library Journal.  
Then as news spreads through the day, you send a <a href="http://twitter.com/#!/HarperCollins/status/41505956953067520" title="Tweet from HarperCollins">single tweet</a>.  In fact, you don&#8217;t really publicly respond <a href="http://harperlibrary.typepad.com/my_weblog/2011/03/open-letter-to-librarians.html" title="Open Letter to Librarians | Library Love Fest">until five days after</a> the Twitter universe and biblio-blogosphere have been talking about it.  And it is pretty much a non-engaging, public relations response.  (You do get credit, though, for allowing open comments on your blog post.  But some of that credit is taken back because you aren&#8217;t using a company-branded blog.  Really? A typepad.com blog?  One of the tenets of most information literacy courses I&#8217;ve seen is to look for the source of the information, and it requires extra effort to take this blog seriously because it isn&#8217;t in the harpercollins.com domain space.)</p><p>If I were to guess, this seems like a trial balloon that was badly floated.  I certainly can&#8217;t fault HarperCollins for trying something new in the ebook licensing world, but this one has fallen flat.</p><p><h2 id="ia-ol-ill">Internet Archive and Library Partners Develop Joint Collection of 80,000+ eBooks To Extend Traditional In-Library Lending Model</h2></p><blockquote><p>Today [February 22, 2011], a group of libraries led by the Internet Archive announced a new, cooperative <a href="http://openlibrary.org/borrow" title="Borrow Books (Open Library)">80,000+ eBook lending collection</a> of mostly 20th century books on OpenLibrary.org, a site where it’s already possible to read over 1 million eBooks without restriction. During a library visit, patrons with an OpenLibrary.org account can borrow any of these lendable eBooks using laptops, reading devices or library computers. This new twist on the traditional lending model could increase eBook use and revenue for publishers. 
&#8230;</p><p>Any OpenLibrary.org account holder can borrow up to 5 eBooks at a time, for up to 2 weeks. Books can only be borrowed by one person at a time. People can choose to borrow either an in-browser version (viewed using the Internet Archive’s BookReader web application), or a PDF or ePub version, managed by the free Adobe Digital Editions software. &#8230;</p><p>Publishers selling their eBooks to participating libraries include Cursor and OR Books. Books purchased will be lent to readers as well as being digitally preserved for the long-term. This continues the traditional relationship and services offered by publishers and libraries.</p></blockquote><p>This press release from the <a href="http://www.archive.org/post/349420/in-library-ebook-lending-program-launched" title="Internet Archive and Library Partners Develop Joint Collection of 80,000+ eBooks To Extend Traditional In-Library Lending Model">Internet Archive</a> largely went unnoticed on the eve of the <a href="#hcod">#hcod</a> onslaught.  It was covered in the <a href="http://chronicle.com/blogs/wiredcampus/collaboration-seeks-to-provide-easier-access-to-e-books/30054" title="Collaboration Seeks to Provide Easier Access to E-Books | The Chronicle of Higher Education Wired Campus blog">Chronicle of Higher Education&#8217;s Wired Campus blog</a> and in <a href="http://www.libraryjournal.com/lj/home/889508-264/internet_archive_tests_new_ebook.html.csp" title="Internet Archive Tests New Ebook Lending Waters: In-Library, and License-Free | Library Journal">Library Journal</a>.  
The latter has a few more helpful details: &#8220;IA founder Brewster Kahle and director Peter Brantley also told <em>LJ</em> that small independent publishers <a href="http://thinkcursor.com/" title="Cursor homepage">Cursor</a>, <a href="http://www.orbooks.com/" title="OR Books homepage">OR Books</a>, and <a href="http://www.smashwords.com/" title="Smashwords homepage">Smashwords</a> will donate ebooks license-free to the Open Library for lending to all Open Library members. With this venture, IA hopes to establish a &#8220;first-sale precedent&#8221; for e-lending, according to Brantley.&#8221;  One must be from one of the <a href="http://openlibrary.org/libraries" title="Libraries (Open Library)">participating libraries</a> to check out books.  My experience with most Internet Archive efforts is that the initial announcement is very subtle and not picked up widely, then slowly grows to something substantial.  I expect this project will follow much the same path and will have a noticeable imprint on the profession in a few years.</p><p><h2 id="collections-futures">Slidecast of David Lewis’ “Collections Futures” Talk</h2></p><blockquote><ul type="circle"><li>Context<ul type="disc"><li>The Big Shift</li><li>Interlude with Clay Shirky</li><li>A Bit of Disruptive Innovation Theory</li></ul></li><li>Collections in “A Strategy for Academic Libraries in the First Quarter of the 21st Century”</li><li>What Will Be Easy and What Will Be Hard</li></ul></blockquote><p>So far in <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym> Thursday Threads</i> I&#8217;ve intentionally avoided pointing to items inside this blog &#8212; preferring to link to events, resources, and conversations elsewhere.  I&#8217;m going to make sort-of-an-exception in this case because what I&#8217;m ultimately pointing to is not my work.  
It is a <a href="http://dltj.org/article/collections-futures/">slidecast (recorded audio synchronized to slides) of David Lewis&#8217; presentation</a> at the <a href="http://www.oclc.org/research/events/2010-06-09a.htm" title="2010 RLG Partnership Annual Meeting Agenda">2010 Annual RLG Partnership Meeting</a>.  Starting with a foundation from John Hagel III, John Seely Brown and Lang Davison called the &#8220;<a href="http://www.johnhagel.com/shiftindex.pdf" title="Measuring the forces of long-term change: The 2009 Shift Index">Shift Index</a>&#8221; [PDF], Clay Shirky&#8217;s “<a href="http://www.ted.com/talks/clay_shirky_how_cellphones_twitter_facebook_can_make_history.html" title="Clay Shirky: How social media can make history | Video on TED.com">How Social Media Can Make History</a>” TED Talk, and Clayton Christensen&#8217;s disruptive innovation theories, David walks through the possibilities for three strategic issues facing academic libraries:  Complete the migration from print to electronic collections; Retire legacy print collections; and Migrate the focus of collections from purchasing materials to curating content.  
The slidecast is about 75 minutes long and well worth the time as a thought-provoking view of what libraries should be doing to survive the next few decades.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/thursday-threads-2011w9/feed/</wfw:commentRss> <slash:comments>6</slash:comments> </item> <item><title>Thursday Threads: Open Publishing Alternatives, Open Bibliographic Data, Earn an MBA in Facebook, Unconference Planning</title><link>http://dltj.org/article/thursday-threads-2010w48/</link> <comments>http://dltj.org/article/thursday-threads-2010w48/#comments</comments> <pubDate>Fri, 03 Dec 2010 02:17:31 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Thursday Threads]]></category> <category><![CDATA[Creative Commons]]></category> <category><![CDATA[Diane Hillman]]></category> <category><![CDATA[ebooks]]></category> <category><![CDATA[ejournal]]></category> <category><![CDATA[Functional Requirements for Bibliographic Records]]></category> <category><![CDATA[GlueJar]]></category> <category><![CDATA[John Wilkin]]></category> <category><![CDATA[Karen Coyle]]></category> <category><![CDATA[linked data]]></category> <category><![CDATA[OKFN]]></category> <category><![CDATA[Open Bibliographic Data]]></category> <category><![CDATA[Open Library]]></category> <category><![CDATA[publishing]]></category> <category><![CDATA[Resource Description and Access]]></category> <category><![CDATA[semantic web]]></category> <category><![CDATA[unconference]]></category> <category><![CDATA[University of Pittsburgh]]></category><guid isPermaLink="false">http://dltj.org/?p=1880</guid> <description><![CDATA[Receive DLTJ Thursday Threads:&#8226;&#160;by&#160;E-mail&#8226;&#160;by&#160;RSS&#160;Delivered by FeedBurner The highlights of the past week are around publishing &#8212; first with a model proposed by Eric Hellman in which consumers can pool enough money to pay publishers to &#8220;set a book free&#8221; under &#8230; <a 
href="http://dltj.org/article/thursday-threads-2010w48/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=1880"></abbr><div id="feedburner-thursday-threads-email-w48" class="wp-caption alignright" style="width: 230px;;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><form style="border:1px solid #ccc;padding:3px;margin:0;text-align:center;" action="http://feedburner.google.com/fb/a/mailverify" method="post" target="popupwindow" onsubmit="window.open('http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads', 'popupwindow', 'scrollbars=yes,width=550,height=520');return true"><p>Receive <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> Thursday Threads:</p><p>&bull;&nbsp;by&nbsp;<a href="http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads&#038;loc=en_US" title="D.L.T.J. Thursday Threads Email Subscription">E-mail</a><br /><input type="text" style="width:140px" name="email" value="Your e-mail address" onFocus="if (this.defaultValue==this.value) this.value = ''"/><input type="hidden" value="thursday-threads" name="uri"/><input type="hidden" name="loc" value="en_US"/><input type="submit" value="Subscribe" /></p><p>&bull;&nbsp;by&nbsp;<a href="http://feeds.dltj.org/thursday-threads/" title="D.L.T.J. Thursday Threads RSS Feed">RSS</a>&nbsp;<a href="http://feeds.dltj.org/thursday-threads/" title="D.L.T.J. 
Thursday Threads RSS Feed"><img src="http://cdn.dltj.org/wp-content/uploads/2010/12/feed-icon32x32.png" alt="RSS Icon" width="12" height="12" /></a></p><p style="font-size: 80%">Delivered by <a href="http://feedburner.google.com" target="_blank" title="Google Feedburner Service">FeedBurner</a></p></form></div><p> The highlights of the past week are around publishing &#8212; first with a model proposed by Eric Hellman in which consumers can pool enough money to pay publishers to &#8220;set a book free&#8221; under a Creative Commons license, then with an announcement by the University of Pittsburgh offering free hosting of open access e-journals.  Since we have to be able to describe and find this content, their bibliographic descriptions are important; John Wilkin proposes a model for open access to elements of bibliographic descriptions.  Rounding out this week&#8217;s topics are a report of a master&#8217;s degree program in business using Facebook, and tips for planning an unconference meeting.</p><p><h2><a name="paying_publishers">Paying Publishers to Set their Content Free</a></h2></p><blockquote><p>[Eric] Hellman’s new model is something he calls GlueJar.  He proposes to “unglue” e-books from their publishers so that they can be available to the world, DRM-free and under Creative Commons license.  Here’s the model: publishers sign on with works that they want to “unglue.”  They determine what they are willing to be paid for ungluing each work.  Users contribute money towards the ungluing.  When the threshold amount is reached for a given title, that title is unglued: it appears in all contributors’ e-book reader libraries and in repositories used for online public library access.  The publisher is paid, and GlueJar takes a commission.</p><p>In other words, publishers just need to determine a price for content being taken off their hands, and if the public is willing to pay that price, it happens.  
(Users aren’t charged until works they want to unglue are unglued.)  No more transaction costs; anyone can distribute the content to anyone else.  Publishers could possibly retain subsidiary rights to the content, such as print on demand or derivative work rights.</p></blockquote><p>Bill Rosenblatt of the Copyright and Technology blog looks at the problem publishers have of <a href="http://copyrightandtechnology.com/2010/11/09/paying-publishers-to-set-their-content-free/" title="Paying Publishers to Set their Content Free | Copyright and Technology">finding good content creators and having a model that makes that content widely available</a>.  Towards the end of his post, he summarizes Eric Hellman&#8217;s <a href="http://go-to-hellman.blogspot.com/2010/10/business-idea-4-ungluing-ebooks.html" title="Business Idea #4: Ungluing eBooks | Go To Hellman">proposed model for &#8220;ungluing ebooks&#8221;</a> in a way that makes sense for creators, publishers, and consumers.  So far as I know, no one has taken Eric up on a trial of his model, but I think it would be interesting to see if it was practical.  
[Found via OCLC Research's <a href="http://www.oclc.org/research/publications/newsletters/abovethefold/2010-11-24.htm" title="OCLC's Above the Fold - November 24, 2010">Above the Fold</a>.]</p><p><h2><a name="upitt_ejournal_hosting">University of Pittsburgh Library System Offers Free E-Journal Publishing Service</a></h2></p><blockquote><p>Pitt’s <a href="http://www.library.pitt.edu/" title="University of Pittsburgh Library System">University Library System</a> (ULS) is now offering free e-journal publishing services to help academic journals make their content available to a global audience while eliminating the cost of print production.&nbsp;</p><p>The E-journal Publishing Program—part of ULS’ <a href="http://www.library.pitt.edu/dscribe/" title="University of Pittsburgh D-Scribe Digital Publishing Program">D-Scribe Digital Publishing Program</a>, which partners with the University of Pittsburgh Press—“is in keeping with the ULS’ commitment to free and immediate access to scholarly information and its mission to support researchers in the production and sharing of knowledge in a rapidly changing publishing industry,” said Rush G. Miller, Hillman University Librarian and director of the ULS.&nbsp;</p><p>The ULS trains a journal’s editorial staff in the use of Open Journal Systems (OJS) software, which channels the flow of scholarly content from initial author submissions through peer review and final online publication and indexing. OJS provides the tools necessary for the layout, design, copy editing, proofreading, and archiving of journal articles. The platform provides a vast set of reading tools to extend the use of scholarly content through RSS feeds and postings to Facebook and Twitter. 
E-journal articles can be discovered via blogs, databases, search engines, library collections, and other means.&nbsp;</p></blockquote><p>The University of Pittsburgh <a href="http://www.news.pitt.edu/news/university-pittsburgh-library-system-offers-free-e-journal-publishing-service" title="University of Pittsburgh Library System Offers Free E-Journal Publishing Service | University of Pittsburgh News">announced</a> that it is offering the <a href="http://www.library.pitt.edu/e-journals/tools.html" title="Tools and Services&amp;gt; D-Scribe Digital Publishing">infrastructure</a> for managing and hosting electronic journals with an at-cost print-on-demand supplement. Since the cost of the digital publishing platform is absorbed by the University of Pittsburgh and since peer review is typically done at no cost, what&#8217;s left on the expense side of the balance sheet? Paying the editorial staff? Marketing and advertising the journal?  Has the University of Pittsburgh tipped the equation enough to make this model viable?</p><p><h2><a name="open_bib_data">Open Bibliographic Data: How Should the Ecosystem Work?</a></h2></p><blockquote><p>In the conversations about openness of bibliographic data, I often find myself in an odd position, vehemently in support of it but almost as vehemently alarmed at the sort of rhetoric that circulates about the ways that data should be shared.</p><p>The problem with both the arguments OCLC makes and many of the arguments for openness seem to be predicated on the view that bibliographic data are largely inert, lifeless “records” and that these records are the units that should be distributed and consumed.</p><p>Nothing could be further from the truth.</p></blockquote><p>The above quote is just one small piece of a <a href="http://blog.okfn.org/2010/11/29/open-bibliographic-data-how-should-the-ecosystem-work/" title="Open Bibliographic Data: How Should the Ecosystem Work? 
| Open Knowledge Foundation Blog">posting by John Wilkin</a> on the Open Knowledge Foundation blog.  In it he plants a flag for the library profession to drive towards: bibliographic data that is published in a fine-grained, easily recombined manner.  By being too focused on silos of &#8220;lifeless records&#8221; (WorldCat, local ILSs, <a href="http://openlibrary.org/" title="Internet Archive's Open Library">Open Library</a>, etc.), he suggests that the profession is missing out on ways we (and our users!) can combine and enhance bibliographic data.  John&#8217;s statement is in parallel with a growing movement towards linked data, a movement that encompasses a reinvigoration of bibliographic description using <acronym title="Functional Requirements for Bibliographic Records">FRBR</acronym> and <acronym title="Resource Description and Access">RDA</acronym> (the current and progressive best thinking of the library community) with the foundational elements of the &#8220;semantic web&#8221; vision.  For more on the latter, see the work of the <a href="http://www.w3.org/2005/Incubator/lld/" title="W3C Library Linked Data  Incubator Group">W3C-supported Library Linked Data Incubator Group</a> and <a href="http://www.dlib.org/dlib/january10/hillmann/01hillmann.html" title="http://www.dlib.org/dlib/january10/hillmann/01hillmann.html">the</a> <a href="http://www.dlib.org/dlib/january07/coyle/01coyle.html" title="Resource Description and Access (RDA): Cataloging Rules for the 20th Century">work</a> of Karen Coyle and Diane Hillmann, among others.</p><p>On a related note, the JISC community in the UK has also published the <a href="http://obd.jisc.ac.uk/" title="Open Bibliographic Data Guide">Open Bibliographic Data Guide</a>.  
&#8220;It is about the business cases for Open Bibliographic Data – releasing some or all of a library’s catalogue records for open use and re-use by others.&#8221;</p><p><h2><a name="facebook_mba">Poking, Tagging and Now Landing an M.B.A</a></h2></p><blockquote><p>But thanks to a pair of young British entrepreneurs, students who do want both a business education and the credential to prove it can now pursue their studies at the same time as they “poke” their friends, tag photos, update their relationship status or harvest their virtual crops on FarmVille.</p><p>The London School of Business and Finance Global M.B.A. bills itself as “the world’s first internationally recognized M.B.A. to be delivered through a Facebook application.”</p></blockquote><p>Hmm &#8212; meet the students where they are? This <a href="http://www.nytimes.com/2010/11/29/education/29iht-educlede29.html" title="http://www.nytimes.com/2010/11/29/education/29iht-educlede29.html">story from the New York Times</a> outlines an MBA program that is fully immersed in the Facebook environment.  I wonder if the completion rate of a Facebook-based program will be higher than that of other online systems because users spend more time in the Facebook environment. [Via <a href="http://keptup.typepad.com/academic/2010/11/earn-your-mba-on-facebook.html" title="The Kept-Up Academic Librarian: Earn Your MBA On Facebook">Steven Bell</a>]</p><p><h2><a name="unconference_planning">How I Planned a Successful Unconference in 6 hours &#8211; and You Can Too</a></h2></p><blockquote><p>Last Friday I ran WhereCamp5280 in Denver, which attracted over 70 people (many from out of state and a couple from Canada), used thousands of dollars from top-tier sponsors and was organized in probably less than six hours total. An unconference is a conference in the loosest of terms. People show up, we build our own agenda and then go for it. 
Here I&#8217;ll describe how it was run.</p></blockquote><p>Steve Coast, a guest author for ReadWriteWeb, gives this <a href="http://www.readwriteweb.com/hack/2010/11/how-i-planned-a-successful-unconference-in-6-hours---and-you-can-too.php" title="How I Planned a Successful Unconference in 6 hours - and You Can Too">how-to guide for planning an unconference</a>.  An <a href="http://en.wikipedia.org/wiki/Unconference" title="Unconference - Wikipedia, the free encyclopedia">unconference</a> is a relatively new style of event where the content of the meeting is defined by the people who show up and participate.  The common guidelines for such meetings<sup><a href="http://dltj.org/article/thursday-threads-2010w48/#footnote_0_1880" id="identifier_0_1880" class="footnote-link footnote-identifier-link" title="These rules are common, but I found them most clearly expressed at the Scratchpad Wikia.">1</a></sup> are: 1) The people who come are the best people who could have come; 2) Whatever happens is the only thing that could have happened; 3) It starts when it starts; 4) It&#8217;s over when it&#8217;s over; and 5) Exercise the Law of Two Feet.  The last might take some more explanation; it means: &#8220;If you are not learning or contributing to a talk or presentation or discussion it is your responsibility to find somewhere where you can contribute or learn.&#8221;</p><p>In my experience, the unconference format is great if you want a group to brainstorm around a central idea or if you want to promote professional networking connections among a group.  
If you are looking for a particular outcome or have a specific agenda, this format does not work well.</p><h2>Footnotes</h2><ol class="footnotes"><li id="footnote_0_1880" class="footnote">These rules are common, but I found them most clearly expressed at the <a href="http://scratchpad.wikia.com/wiki/UnConference_'Rules'" title="UnConference 'Rules - Scratchpad Wiki Labs - Free wikis from Wikia">Scratchpad Wikia</a>.</li></ol>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/thursday-threads-2010w48/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>Amazon Catalog Updates</title><link>http://dltj.org/article/amazon-catalog-updates/</link> <comments>http://dltj.org/article/amazon-catalog-updates/#comments</comments> <pubDate>Thu, 13 May 2010 02:19:19 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Blue Sky]]></category> <category><![CDATA[Amazon]]></category> <category><![CDATA[crowdsourcing]]></category> <category><![CDATA[metadata]]></category> <category><![CDATA[Open Library]]></category><guid isPermaLink="false">http://dltj.org/?p=1564</guid> <description><![CDATA[Did you know that Amazon offers a facility to make corrections to its catalog? Somewhere in the past few months someone mentioned this to me and I tried it out. (Unfortunately, it has been long enough now that I&#8217;ve forgotten &#8230; <a href="http://dltj.org/article/amazon-catalog-updates/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=1564"></abbr><p>Did you know that Amazon offers a facility to make corrections to its catalog?  Somewhere in the past few months someone mentioned this to me and I tried it out.  
(<del datetime="2010-05-14T13:34:39+00:00">Unfortunately, it has been long enough now that I&#8217;ve forgotten who told me; if you are the one, please fess up in <a href="http://dltj.org/article/amazon-catalog-updates/#respond">this post&#8217;s comments section</a>.</del> <ins datetime="2010-05-14T13:34:39+00:00">It was Ron Murray from the Library of Congress.  Thanks, Ron!</ins>)  And it works!  Is this a model for crowdsourced corrections to library data?<br /><span id="more-1564"></span><br />Here is how it looks from a user&#8217;s perspective.</p><p><h2>Step 1. Finding something to correct</h2><br />Amazon has a pretty good catalog, so for the purposes of demonstrating this feature it took a while to find a record to correct.  I used the suggestions from <a href="http://librarytypos.blogspot.com/" title="Typo of the day for librarians">Typo of the Day for Librarians</a> for ideas of errors to look for in the Amazon catalog.  One of the suggested typos was <a href="http://librarytypos.blogspot.com/2010/03/sucess-etc-for-success-etc.html" title="Typo of the day for librarians: Sucess*, etc. (for Success, etc.)">Sucess*, etc. (for Success , etc.)</a>, and I found a record for <a href="http://www.amazon.com/How-Talk-Anyone-Success-Relationships/dp/1593160267/" title="Amazon product page for &#039;How to Talk to Anyone&#039;">How to Talk to Anyone: 62 Little Tricks for Big Success in Relationships</a> in audio CD format with this misspelling.  
As this image shows, the original title was &#8220;How to Talk to Anyone: 62 Little Tricks for Big <em>Sucess</em> in Relationships&#8221;<br /><div id="attachment_1583" class="wp-caption aligncenter" style="width: 682px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-page-with-typo.png"><img src="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-page-with-typo-cropped.png" alt="" title="Amazon page for &#039;How to Talk to Anyone&#039; with typo" class="size-full wp-image-1583" width="672" height="396"/></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Amazon page for 'How to Talk to Anyone' with typo</p></div></p><p><h2>Step 2. Making the Correction</h2><br />In the &#8220;Product Details&#8221; section of the Amazon catalog page is a link to &#8220;update product info&#8221;<br /><div id="attachment_1586" class="wp-caption aligncenter" style="width: 682px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-page-with-typo.png"><img src="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-page-with-typo-cropped-2.png" alt="" title="Excerpt of Amazon product information page with the &#039;update product info&#039; link highlighted" class="size-full wp-image-1586" width="672" height="328"/></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Excerpt of Amazon product information page with the 'update product info' link highlighted</p></div><br />Following that link takes you to a form that is prefilled with all of the information from the Amazon catalog.  You can make your corrections here and provide citation URLs to reference the source of the correct information.  
(In the excerpt of the form on this page only the Title and Reference sections are shown.  Click through the image to see the full version of the form.)<br /><div id="attachment_1587" class="wp-caption aligncenter" style="width: 830px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-catalog-update-form.png"><img src="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-catalog-update-form-cropped.png" alt="" title="Excerpt of Amazon Catalog Update Form" class="size-full wp-image-1587" width="820" height="573"/></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Excerpt of Amazon Catalog Update Form</p></div><br />You are given a chance to preview your changes before submitting them.  Note in this case that the reference URL I&#8217;m using is actually a link to the cover image for this item at Amazon.  
A bit of neat symmetry there, I figure.<br /><div id="attachment_1589" class="wp-caption aligncenter" style="width: 830px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-catalog-update-preview.png"><img src="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-catalog-update-preview-cropped.png" alt="" title="Preview of Amazon Catalog Updates" class="size-full wp-image-1589" width="820" height="400"/></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Preview of Amazon Catalog Updates</p></div><br />After submitting the changes, you get a nice &#8220;thank you&#8221; from Amazon for making their service better.<br /><div id="attachment_1590" class="wp-caption aligncenter" style="width: 830px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;"><a href="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-catalog-update-submitted.png"><img src="http://cdn.dltj.org/wp-content/uploads/2010/05/amazon-catalog-update-submitted-cropped.png" alt="" title="Submission confirmation page from Amazon Catalog Update service" class="size-full wp-image-1590" width="820" height="145"/></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Submission confirmation page from Amazon Catalog Update service</p></div></p><p><h2>Step 3. 
Getting Confirmation from Amazon</h2><br />After a bit &#8212; mere hours in my case &#8212; Amazon will send you a confirmation back that the correction has been accepted.</p><blockquote><p>From: &#8220;gfix-noreply@amazon.com&#8221; &lt;gfix-noreply@amazon.com&gt;<br />To: &#8220;peter@OhioLINK.edu&#8221; &lt;peter@OhioLINK.edu&gt;<br />Subject: Your Amazon.com Catalog Update Request</p><p>==== This is an automated response message - please do not reply ====</p><p>Thank you for using the Catalog Update Form to send suggestions for</p><p>How to Talk to Anyone: 62 Little Tricks for Big Sucess in Relationships (ASIN 1593160267)</p><p>Your update has been accepted and processed. It will appear online within the next two to three business days.<br />Attribute: Title<br />Current value:<br />How to Talk to Anyone: 62 Little Tricks for Big Sucess in Relationships</p><p>Your suggestion:<br />How to Talk to Anyone: 62 Little Tricks for Big Success in Relationships</p><p>Data accuracy is highly important to us. We appreciate the time you have taken to submit your updates to us.</p><p>Best regards,</p><p>Catalog Department<br />www.amazon.com</p></blockquote><p>And if you go to this <a href="http://www.amazon.com/How-Talk-Anyone-Success-Relationships/dp/1593160267/" title="http://www.amazon.com/How-Talk-Anyone-Success-Relationships/dp/1593160267/">product page now</a> you&#8217;ll see the title has been corrected.</p><p><h2>Would this Work for Libraries?</h2><br />Now Amazon must have some resources backing up this service to do the verification of submissions.  And it makes sense for them because corrected metadata makes it easier for their products to be found and purchased.  If libraries were to consider providing an equivalent service for our metadata, could we justify the costs?  
Is this a good use of our time and effort?</p><p>If we were to do it, I think it might have to be done by a bibliographic utility like OCLC, which has ways to push the updated records to member libraries.  Otherwise we run the risk of diluting the corrections across many individual library catalogs.  Interestingly, this sort of user-generated correction facility is one that the <a href="http://openlibrary.org/" title="Open Library homepage" rel="homepage">Open Library</a> already provides. (Open Library is a wiki-like service that offers the ability for anyone to make changes to its records, much like how <a href="http://en.wikipedia.org/wiki/Welcome_to_Wikipedia" title="http://en.wikipedia.org/wiki/Welcome_to_Wikipedia">anyone can edit articles on Wikipedia</a>.)  So between Amazon and Open Library there is a continuum of workflows from mediated corrections to unmediated corrections for us to consider.  This scheme, of course, begs us to consider the notion of <a href="http://journal.code4lib.org/articles/86" title="The Code4Lib Journal &amp;#8211; Distributed Version Control and Library Metadata">distributed version control systems for handling our bibliographic data</a> so that changes can be merged across many sources.</p><p>Lots to think about&#8230;</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/amazon-catalog-updates/feed/</wfw:commentRss> <slash:comments>20</slash:comments> </item> <item><title>Mashups of Bibliographic Data: A Report of the ALCTS Midwinter Forum</title><link>http://dltj.org/article/mashups-of-bib-data/</link> <comments>http://dltj.org/article/mashups-of-bib-data/#comments</comments> <pubDate>Wed, 27 Jan 2010 21:14:52 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Meeting]]></category> <category><![CDATA[ALA Midwinter Conference 2010]]></category> <category><![CDATA[Association for Library Collections and Technical Services]]></category> <category><![CDATA[Dewey Decimal 
Classification]]></category> <category><![CDATA[Google Book Search]]></category> <category><![CDATA[Internet Archive]]></category> <category><![CDATA[MARC]]></category> <category><![CDATA[OCLC]]></category> <category><![CDATA[onix]]></category> <category><![CDATA[Open Library]]></category> <category><![CDATA[WorldCat]]></category><guid isPermaLink="false">http://dltj.org/?p=1478</guid> <description><![CDATA[This year the ALCTS Forum at ALA Midwinter brought together three perspectives on massaging bibliographic data of various sorts in ways that use MARC, but where MARC is not the end goal. What do you get when you swirl MARC, &#8230; <a href="http://dltj.org/article/mashups-of-bib-data/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=1478"></abbr><p>This year the <a href="http://connect.ala.org/node/91406" title="ALCTS Forum: Mix and Match: Mashups of Bibliographic Data | ALA Connect"><acronym title="Association for Library Collections and Technical Services">ALCTS</acronym> Forum at <acronym title="American Library Association">ALA</acronym> Midwinter</a> brought together three perspectives on massaging bibliographic data of various sorts in ways that <em>use</em> <acronym title="Machine Readable Cataloging">MARC</acronym>, but where MARC is not the end goal.  What do you get when you swirl MARC, <acronym title="ONline Information eXchange">ONIX</acronym>, and various other formats of metadata in a big pot?  Three projects:  ONIX Enrichment at OCLC, the Open Library Project, and Google Book Search metadata.<br /><span id="more-1478"></span><br />Below is a summary of how these three projects are messin&#8217; with metadata, as told by the Forum panelists.  
I also recommend reading Eric Hellman&#8217;s <a href="http://go-to-hellman.blogspot.com/2010/01/google-exposes-book-metadata-privates.html" title="Google Exposes Book Metadata Privates at ALA Forum | Go-to-Hellman">Google Exposes Book Metadata Privates at ALA Forum</a> for his recollection and views of the same meeting.</p><p><h2 id="post-1478-h2-OCLC-ONIX">ONIX Enrichment at OCLC</h2></p><p><span class="removed_link" title="http://www.oclc.org/speakers/bios/register_renee.htm">Renee Register</span>, Global Product Manager for OCLC Cataloging and Metadata Services, was the first to present on the panel.  Her talk looked at a new and evolving product at OCLC on the enhancement of ONIX records with WorldCat records, and vice versa. <sup><a href="http://dltj.org/article/mashups-of-bib-data/#footnote_0_1478" id="identifier_0_1478" class="footnote-link footnote-identifier-link" title="For those not familiar with ONIX, it is a suite of standards promulgated by EDItEUR for the interchange of information on books and serial publications.  It is primarily used as the communication channel between the publishing industry through distribution chains to retail establishments.">1</a></sup></p><p>As libraries, Renee said &#8220;our instincts are collaborative&#8221; but &#8220;our data and workflow silos encourage redundancy and inhibit interoperability.&#8221;  Beyond the obvious differences in metadata formats, the workflows of libraries differ dramatically from other metadata providers and consumers. In libraries (with the exception of <acronym title="Cataloging in Publication">CIP</acronym> and brief on-order records) the major work of bibliographic production is performed at the end of the publication cycle and ends with the receipt of the published item.  In the publisher supply chain, bibliographic data evolves over time, usually beginning months before publication and continuing to grow for months and years (sales information, etc.) after publication.  
Renee had a graphic showing the current flow of metadata around the broader bibliographic universe that highlighted the isolation of library activity relative to publisher, wholesaler, and retailer activity.</p><p><div id="attachment_1484" class="wp-caption alignright" style="width: 310px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><a href="http://www5.oclc.org/downloads/presentations/MDS4Pubs_August_Webinar_200908.ppt" title="Slides from Publisher Supply Chain Webinar, August 2009"><img src="http://cdn.dltj.org/wp-content/uploads/2010/01/ONIX-enhancement-300x225.jpg" alt="" title="Diagram of the Process of Enhancing ONIX Records" width="300" height="225" class="size-medium wp-image-1484" /></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Diagram of the Process of Enhancing ONIX Records, from OCLC Services for the Publisher Supply Chain Webinar, August 2009</p></div>Renee went on to describe a &#8220;next generation cataloging data flow&#8221; where OCLC facilitates the inclusion of publisher data into <a href="http://www.worldcat.org/" title="WorldCat homepage" rel="homepage">WorldCat</a> and enhances publisher data with information extracted from WorldCat.  To the right is a version of the graphic she used at Midwinter taken from an earlier presentation on the same topic.  It shows ONIX-formatted metadata coming into WorldCat, being cross-walked and matched with existing MARC data in WorldCat, and finally extracted and cross-walked back to ONIX, resulting in <a href="http://publishers.oclc.org/en/metadata/default.htm" title="OCLC Metadata Services for Publishers">enhanced ONIX metadata</a> for publishers to use in their supply chain.  
If there is an exact match for the incoming ONIX record in WorldCat, the WorldCat record is enhanced with certain fields from the ONIX record (descriptions, author biographies, web links) &#8212; being careful not to override authority work being done by libraries, but adding enhancements that libraries may not otherwise input.  In turn, enhancements from the exact-match record and FRBR work set records (hardcover versus softcover versus audiobook, etc.) are added to the ONIX record: adding non-English subject headings, adding a Dewey Decimal Classification (DDC) field from another similar record if one doesn&#8217;t already exist, and changing the author field to an authority-controlled version.  If there is not an exact match for the ONIX record in WorldCat, a new WorldCat record is built from the ONIX record and it is subsequently enhanced by metadata found in the FRBR work set records.  In doing so, we are &#8220;increasing the goodness of metadata in the marketplace,&#8221; as Renee put it in her presentation.  OCLC is also creating a mapping between <a href="http://www.bisg.org/what-we-do-20-73-bisac-subject-headings-2009-edition.php" title="Standards &amp; Best Practices | Classification Schemes | BISAC Subject Headings 2009 Edition | Book Industry Study Group">BISAC Subject Headings</a><sup><a href="http://dltj.org/article/mashups-of-bib-data/#footnote_1_1478" id="identifier_1_1478" class="footnote-link footnote-identifier-link" title="By the way, it seems like BISAC is an acronym for &amp;#8220;Book Industry Systems Advisory Committee&amp;#8221;, the former name of the Book Industry Study Group.">2</a></sup> and the DDC system.  
This allows the enhancement of ONIX with suggestions of BISAC Subject Terms and the enhancement of WorldCat records with generic DDC fields given an incoming BISAC Subject Term value from the ONIX record.</p><p>In her experience, Renee said that libraries need ways to enable our metadata to evolve over time and allow for publisher-created metadata to merge effectively with library-created metadata.  The bibliographic record needs to be a &#8220;living, growing&#8221; thing throughout the lifecycle of a title and beyond.  In concluding her remarks, she offered several resources to explore for further information:  the OCLC/NISO study on <a href="http://www.niso.org/publications/white_papers/StreamlineBookMetadataWorkflowWhitePaper.pdf" title="Streamlining Book Metadata Workflow">Streamlining Book Metadata Workflow</a>, the U.K. Research Information Network report on <a href="http://rin.ac.uk/creating-catalogues" title="Creating Catalogues: Bibliographic Records in a Networked World">Creating Catalogues: Bibliographic Records in a Networked World</a>, the Library of Congress <a href="http://www.loc.gov/bibliographic-future/news/" title="News, Press Releases and Reports - Working Group on the Future of Bibliographic Control (Library of Congress)">Study of the North American MARC Records Marketplace</a>, the Library of Congress <a href="http://cip.loc.gov/onixpro.html" title="LC ONIX Pilot Project" class="broken_link" rel="nofollow">CIP/ONIX Pilot Project</a>, and the <a href="http://publishers.oclc.org/en/default.htm" title="OCLC Publisher Supply Chain Website">OCLC Publisher Supply Chain Website</a>.</p><p><h2 id="post-1478-h2-Open-Library">From MARC to Wiki with Open Library</h2><br />The second presenter on the panel was <a href="http://kcoyle.net/" rel="homepage" title="Karen Coyle's home page">Karen Coyle</a>, talking about the mashup of metadata at the <a href="http://openlibrary.org/" title="Open Library project homepage" rel="homepage">Open Library</a> project 
at the <a href="http://archive.org/" title="Internet Archive homepage" rel="homepage">Internet Archive</a>.  The slides from her presentation are <a href="http://kcoyle.net/presentations/ol_boston.pdf" title="Open Library - Mix and Match Metadata presentation slides [PDF]">available from her website</a>.</p><p>Karen said right at the start that the Open Library project is different from most of what happens in libraries &#8212; it is &#8220;someone outside the library world making use of library data&#8221; &#8212; although the goal is arguably the same as others &#8212; &#8220;<a href="http://openlibrary.org/about" title="About Us (Open Library)">One web page for every book ever published</a>.&#8221;  As such, the Open Library isn&#8217;t a library catalog as librarians think of it in that it is not a representation of a library&#8217;s inventory. It has metadata for every book it can know about and a pointer to places where the book can be found, including all of the electronic books in Internet Archive (<a href="http://www.opencontentalliance.org/" rel="homepage" title="Open Content Alliance (OCA)">Open Content Alliance</a>, Google Public Domain, etc.) as well as pointers back to OCLC WorldCat.  Karen&#8217;s role for the project is that of &#8220;Library Data Informant.&#8221; The Internet Archive decided that they needed someone who understood library data in order to try to use it.  From Karen&#8217;s perspective, she is trying to be a resource for the project but not give them any guidance on how to implement the service.  She is curious to see what the project would do when bibliographic data is viewed from a non-librarian perspective.  If they have questions, or if they have assumptions about data that are wrong, then she intervenes.</p><p>Karen went on to briefly describe the Open Library system.  Open Library doesn&#8217;t have records; rather, it has field types and data properties.  In this way, it uses semantic web concepts.  
&#8220;Author&#8221; is a type, &#8220;Author birthdate&#8221; is another type, and so forth.  There are no set field types, so if the project gets data from a source for which a type doesn&#8217;t yet exist, it can create a new one.  Each type can have data properties such as string, boolean, text, link, etc.  Nothing is required and everything is repeatable.  Everything &#8212; types, properties, and values &#8212; gets a <acronym title="Uniform Resource Identifier">URI</acronym> (a URI is an identifier like a URL, but conceptually a superset of the universe of URLs).  Titles, authors, subjects, author birthdates, and so on have URIs.  Lastly, the underlying data structures are based on wiki principles: all edits are saved and viewable, anyone can edit any value, anyone can add new types or properties, anyone can develop their own displays, etc.</p><p>The data that is now in Open Library came from a variety of sources.  They started with a copy of book records from the Library of Congress, and continue to receive the weekly updates. They performed a crawl of Amazon&#8217;s book data.  They have gotten some from publishers, libraries, and individual users.  The last is perhaps the most interesting because it is mainly people outside the western world who are otherwise having trouble getting their works recognized.</p><p><h3 id="post-1478-h3-Problems-Issues">Problems, Issues, Challenges, and Opportunities with the Data</h3><br />People who use library data without the biases or assumptions of librarians come up with interesting ways to view the data.  Karen described a few of them.</p><dl class="inlineClass"><dt>Names -</dt><dd>&#8220;These library forms of names? Honestly no one but us can stand them.&#8221;  Even something as simple as the form of last-name-comma-first-name is troublesome.  No one else (Amazon, Wikipedia, etc.) uses this form of the name.  
In processing these, any information between parentheses has been deleted, and birth and death dates have been moved into separate field types.</dd><dt>Titles -</dt><dd>In working with the Open Library developers, this is one place that Karen tried insisting on applying a library practice:  knowing the initial article.  For us, this is important for sorting books in alphabetical order.  The developer response &#8212; why do we have to sort in alphabetical order?  &#8220;Where else but library catalogs do we see things sorted in alphabetical order?  Not in Google, not in Amazon, not anywhere.  Alphabetical order is not in the mindset anymore.&#8221;  They also found that the title might include extraneous data.  Amazon, for instance, appends the series title in parentheses to the main title.  This is a demonstration of how other communities are not as concerned about strongly typing and separating information into fields. Amazon, of course, has reasons for putting series information into the main title: it helps sell books.</dd><dt>Product dimensions -</dt><dd>Publishers and distributors need to know characteristics of an item such as height, width, depth, and weight; they, of course, need to put it in a box and ship it.  Libraries, concerned about placing the item on the shelf, record just height.  Recording pagination is different, too: libraries use odd notations &#8220;ill. (some col)&#8221; and &#8220;xv, 200p.&#8221; versus simply &#8220;200 pages.&#8221;</dd><dt>Birthdates -</dt><dd>Librarians use birthdates to distinguish names; if there is no need to distinguish a name, birth and death dates are not added.  Someone looking at this from the outside would ask &#8216;Why don&#8217;t all authors have birth and death dates?&#8217;  This can be useful information for viewing the context of an item, not just to distinguish author names.  
Open Library ran author names against Wikipedia to pick up not only birth and death years, but also the actual dates.</dd><dt>Subject headings -</dt><dd>Open Library using Library of Congress Subject Headings was out of the question. In processing the data, the Open Library developers just broke them apart into segments and used them. But because they were able to do data mining on the subject field types, they did find statistical relationships between the disassembled precoordinated headings and were able to present those to the user.</dd><dt>The View of the Data -</dt><dd>Rather than a traditional library view of long lists of author-title, the Open Library (in its next version coming in February) will have several different views into the mass of data: Authors; Books (what we would call <acronym title="Functional Requirements for Bibliographic Records">FRBR</acronym> &#8216;manifestations&#8217;); Works; Subjects; and eventually places, publishers, etc.  For example, when searching for an author one would get the author page.  On it would be all of the works from the author as well as other biographical information.  It looks similar to a WorldCat identities page, except it is the actual user interface built into the system.  Similarly, every work will have a page, and at the bottom of it one will see all of the editions of the work.  Also, each subject will have a page, and one will see a list of works with that subject as well as authors who write on that subject.  As Karen said, &#8220;The subject itself becomes an object of interest in the database, not just something that is just tacked on to the bottom of the library record.&#8221;</dd><dt>Data mining -</dt><dd>With the data in this format, it is possible to perform data mining actions against it. For instance, simple data mining such as country of publication, popular places that appear, etc.  
When they had the problem of author names &#8212; knowing when to reverse surname and forename &#8212; they ran the names against Amazon and Wikipedia and retained the ones where they found the order of the entry was the same. The Open Library developers are also experimenting with data mining to find publisher names.  Publisher names, of course, vary dramatically, but by using ISBN prefixes they can pull together related items into a &#8220;publisher&#8221; view.</dd></dl><p>Karen suggested watching <a href="http://edwardbetts.com/ol/" title="Index of /ol">the site of Edward Betts</a>, one of the developers of the Open Library project with an eye on the data mining aspects.  She said it is fun to look at our data when it can be viewed from this different point-of-view.  She also said to watch out for a new version of the <a href="http://openlibrary.org/" title="Open Library (Open Library)">Open Library website</a> coming in February.</p><p><h2 id="post-1478-h2-Google-Book-Search-Metadata">Google Book Search Metadata</h2><br />The final presenter was <a href="http://www.google.com/profiles/kurt.groetsch" title="Kurt Groetsch's Google Profile">Kurt Groetsch</a>, Technical Collections Specialist at Google where he works to provide understanding and insight into library partner collections and the digitized books from Google.  Kurt said that &#8220;Google has been fairly circumspect over the years about what we do on the Book Search project.&#8221;  He said it was a bit of a cultural legacy from the rest of the company and also possibly an artifact of the copyright litigation, but he is hoping to change that.  His presentation looked at how Google works with book metadata from three vantage points &#8212; the inputs into Google&#8217;s system, parsing by Google&#8217;s algorithms, and analysis and output into the public interfaces.</p><p>On the input side, Google is getting bibliographic metadata from over 100 sources in a variety of formats. 
MARC records are coming from libraries, union catalogs, commercial providers (OCLC), publishers/retailers (one publisher supplies records in MARC format).  Google also gets ONIX records from commercial providers (such as Ingram and Bowker), publishers, and retailers.  Google is especially interested in data from non-U.S. retailers because it is a source of information about books published outside the United States; it helps facilitate discovery of items that they may not otherwise encounter in the <a href="https://books.google.com/partner/">publisher</a> and <a href="http://www.google.com/googlebooks/library.html" title="Google Books Library Project">library</a> programs.  Google also receives records in a variety of &#8220;idiosyncratic formats&#8221; &#8212; for example, publisher-contributed metadata (via the Publisher Partner Program); information associating books with jacket images; name authority records (from LC); reviews; popularity signals (sales data as well as <a name="anonymized_circulation_data">anonymized circulation data</a> from some library partners, useful for feeding into the relevancy ranking algorithm); and internally-generated metadata (for instance, whether a book is commercially available or not).  Google processes all of this information to come up with a single record that describes a book.  At this point they have over 800 million bibliographic records and one trillion bits of information in those records.</p><p>All of these records from all of these sources are processed and remixed with Google&#8217;s parsing algorithms about twice a week.  The first step is to transform the incoming records into a &#8220;less verbose format&#8221; for storage and processing.  It is a SQL-like structure that allows elements of the metadata to be queried.  
Records are then parsed to extract specific bits of information, transform the bits as necessary, and write the information to an internal &#8220;resolved records&#8221; data structure (a subset of the data coming from the input formats).  In the presentation, Kurt had examples of how making inferences from data coming from both MARC and ONIX can be troublesome.  Parsing also involves extracting &#8220;bibkeys&#8221; from the records to aid in matching across sources of data.  Four types of identifiers are extracted from bibliographic records: OCLC numbers, <acronym title="Library of Congress Control Numbers">LCCN</acronym>s, ISBNs, and ISSNs.  They usually provide useful signals when matching bibliographic records and help with assertions that two records describe the same manifestation.  Google also tries to parse item data (enumeration and chronology) when present in records representing multi-volume works.  They will also treat a barcode as a form of &#8220;bibkey&#8221; if they get it from a library.  The parsing algorithm will also split records containing multiple ISBNs representing different product forms (e.g. hardback, paperback, etc.).</p><p>With all of this data parsed into records, Google starts its clustering process where records are examined and attached to each other.  Bibkeys provide significant evidence for relating records to each other, but bibkeys are not always present in a record (non-U.S. records and older records frequently contain no bibkeys).  The algorithms then fall back on text similarity matching using title, subtitle, contributor and other fields such as publisher and publication year.  The results are clusters of records representing the same manifestation. An algorithm then attempts to derive the &#8220;best-of&#8221; record for a single cluster from all of the parsed input records.  
This is done in a field-by-field voting process based on the trustworthiness of individual fields from record sources.</p><p>Kurt went into some of the challenges facing the team building the clustering and best-of record creation algorithms.  For instance, in dealing with multivolume works they know of 5 numbering schemas with 3 number types in 15 different languages.  Enumeration is now showing in the public display, but the development team is still working with unparsable item data due to inconsistent cataloging practices between institutions&#8230;and sometimes inconsistencies within an institution.  Another problem is non-unique identifiers. In the current data set ISBN 7899964709 is shared by 75 books and ISBN 7533305353 is associated with 1413 books. There are also poor quality or &#8220;junk records&#8221;.  Kurt said his favorite was &#8220;The Mosaic Navigator&#8221; by Sigmund Freud published in 1939.  These are hard to identify with an algorithm, and they rely on reports of problems that enable the developers to go in and &#8220;kill&#8221; the troublesome record.  Another example is a book by Virginia Woolf where the incoming record had conflicting information; it had two 260 fields that contained different dates (1961, correct, and 1900) with fixed field information that strongly suggested that 1900 was the single date of publication.  When the data problem is systematic, they can identify it and compensate for it.  Kurt&#8217;s example for this case was &#8220;The United States Since 1945&#8221; published in 1899.  This one was highlighted in <a href="http://chronicle.com/article/Googles-Book-Search-A/48245/" title="Google's Book Search: A Disaster for Scholars - The Chronicle Review - The Chronicle of Higher Education">Geoffrey Nunberg&#8217;s criticism of Google Books metadata</a>.  In this case, a source of metadata from Brazil would use 1899 whenever the date of publication was unknown.  
When Google went back and looked at the date distribution of books there was a huge spike in 1899.  Once Google knew about it they were able to go in and kill that information from that source of records. <sup><a href="http://dltj.org/article/mashups-of-bib-data/#footnote_2_1478" id="identifier_2_1478" class="footnote-link footnote-identifier-link" title="A side note: Google isn&amp;#8217;t the only one tripped up by this.  If one searches for the ISBN of the item, 0195038487, you get to more than one site that has the same incorrect publication date.  At least Google is attempting to clean up the data!">3</a></sup></p><p>In closing, Kurt said that Google is committed to engaging with the library community on improving metadata and metadata processing.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://www.niso.org/publications/white_papers/Stream lineBookMetadataWorkflowWhitePaper.pdf to http://www.niso.org/publications/white_papers/StreamlineBookMetadataWorkflowWhitePaper.pdf on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://www.oclc.org/speakers/bios/register_renee.htm on February 11th, 2011.</p><h2>Footnotes</h2><ol class="footnotes"><li id="footnote_0_1478" class="footnote">For those not familiar with <a href="http://www.editeur.org/8/ONIX/" title="ONIX Overview">ONIX</a>, it is a suite of standards promulgated by <a href="http://www.editeur.org/" title="EDItEUR homepage" rel="homepage">EDItEUR</a> for the interchange of information on books and serial publications.  
It is primarily used as the communication channel from the publishing industry through distribution chains to retail establishments.</li><li id="footnote_1_1478" class="footnote">By the way, it seems like BISAC is an acronym for &#8220;Book Industry Systems Advisory Committee&#8221;, the former name of the <a href="http://www.bisg.org/" title="Book Industry Study Group homepage" rel="homepage">Book Industry Study Group</a>.</li><li id="footnote_2_1478" class="footnote">A side note: Google isn&#8217;t the only one tripped up by this.  If one searches for the ISBN of the item, 0195038487, you get to <a href="http://www.biggerbooks.com/book/9780195038484" title="The United States Since 1945 at BiggerBooks.com -  Leuchtenburg, 9780195038484, History">more</a> <a href="http://www.chegg.com/details/the-united-states-since-1945/0195038487/" title="Chegg.com: The United States Since 1945 by Leuchtenburg">than</a> <a href="http://www.amazon.co.uk/The-United-States-Since-1945/dp/0195038487" title="The United States Since 1945: Amazon.co.uk: Books">one</a> site that has the same incorrect publication date.  
At least Google is attempting to clean up the data!</li></ol>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/mashups-of-bib-data/feed/</wfw:commentRss> <slash:comments>23</slash:comments> </item> <item><title>Further Consideration of OCLC Records Use Policy</title><link>http://dltj.org/article/oclc-records-use-policy-2/</link> <comments>http://dltj.org/article/oclc-records-use-policy-2/#comments</comments> <pubDate>Thu, 29 Jan 2009 01:37:26 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[policy]]></category> <category><![CDATA[ALA Midwinter 2009]]></category> <category><![CDATA[Biblios]]></category> <category><![CDATA[copyright]]></category> <category><![CDATA[description]]></category> <category><![CDATA[Google Book Search]]></category> <category><![CDATA[MARC]]></category> <category><![CDATA[OCLC]]></category> <category><![CDATA[Open Library]]></category><guid isPermaLink="false">http://dltj.org/?p=701</guid> <description><![CDATA[At ALA Midwinter, ALCTS sponsored a panel discussion about sharing library-created data inside and outside the library community, with a particular focus on cataloging data. I was honored to be asked to speak on the topic from the perspective of &#8230; <a href="http://dltj.org/article/oclc-records-use-policy-2/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=701"></abbr><p>At ALA Midwinter, ALCTS sponsored a panel discussion about sharing library-created data inside and outside the library community, with a particular focus on cataloging data. I was honored to be asked to speak on the topic from the perspective of a consortial office. 
This is the second and final post in a series that represents an approximation of what I said on the panel.</p><p>The <a href="http://dltj.org/article/oclc-records-use-policy-1/">first part</a> examined the nature of surrogate records that we create as a means to get users to content.   The post looked at where we get records, how humans and machines can create them, and the rights associated with component data that makes up the records.</p><p><h2>Right to reuse records without restrictions</h2><br />One way to handle the clouded nature of surrogate ownership is to follow the lead of &Dagger;biblios.net and the Open Library Project:  publish the surrogates with a public domain dedication or with an &ldquo;open data&rdquo; license.  This is going to become increasingly important as the variety of systems that use this kind of data evolves and in some cases moves outside the library space.</p><p>The first area where this is important is discovery layers.  A new generation of discovery layers is taking surrogates from a variety of sources &ndash; catalogs, publishers, index/abstract services, etc. &ndash; and performing actions such as consolidating records and building relationships between surrogates.  These derived surrogates are presented to users in new interfaces or new portals into existing interfaces.  Examples of these systems are the <a href="http://www.extensiblecatalog.org/" title="About the eXtensible Catalog project">Extensible Catalog project</a> at the University of Rochester and the newly announced <a href="http://www.serialssolutions.com/summon/" title="Summon from Serials Solutions">Serials Solutions Summon product</a>.  OhioLINK recently <a href="http://dltj.org/article/discovery-layer-itn/">solicited responses from vendors</a> where this kind of capability is a key product of a new discovery layer.  
Other projects (such as subject-specific portals) also seek to re-purpose the data &ndash; mix it up with other sources of data to create new uses and views that are specific to a particular user community.  Anything other than a permissive-for-all-by-default policy will put up roadblocks and cause builders of these systems to seek data from other services.</p><p>In addition to presenting the surrogates to users in new ways, libraries are also investigating new forms of workflow and collaborative activities surrounding the creation and maintenance of bibliographic records.  One of the strong desires of many involved in the <a href="http://oleproject.org/" title="The OLE Project homepage">OLE Project</a> is cooperative purchasing and cooperative technical services.  OhioLINK has also recently issued an RFI seeking new options for highly collaborative workflows in the maintenance of surrogate records.  Old models of charging for use of records can hinder the ability of cooperating institutions to optimize costs and efforts of back-room library operations.</p><p>The elephant in the room is the recently proposed OCLC Records Use Policy.  Setting aside the debatable legal framework under which OCLC asserted the right to set a usage policy on records from the cooperative, there were clauses in the proposed policy that jeopardize the usability of records, and as a consequence the viability of the cooperative as a whole.  Actions that restrict use of data or create uncertainty around the use of data lessen the value of that data.  I think few would dispute that value can be created by aggregating services on top of the data; the activities in the <a href="http://www.worldcat.org/devnet/wiki/Services" title="Services - WorldCat Developers&#039; Network">WorldCat Grid</a> and <a href="http://www.worldcat.org/devnet/index.php/Main_Page" title="Main Page - WorldCat Developers&#039; Network">Developer&rsquo;s Network</a> point to that.  
Revenue could be generated in fees charged to non-cooperative members.  It is conceptually important to separate the hosting of the surrogates from the layered services on top of them &ndash; WorldCat Local, mediated ILL, collection analysis, by way of example.</p><p><h2>Parting thoughts</h2><br />OCLC was <a href="http://www.oclc.org/about/history/default.htm" title="History of OCLC">created forty years ago</a> based on the use of new technologies and relationships that technology enabled.  While we all want the cooperative to exist and flourish, it should not do so by engaging in activities that solely protect it.   Portions of the proposed policy appear to mandate that OCLC be in the middle of any exchange of records.  While one can appreciate the ability of a large web footprint like &ldquo;<a href="http://www.worldcat.org/" title="WorldCat homepage">worldcat.org</a>&rdquo; to drive traffic to local libraries, when it comes to sharing factual and non-factual data in surrogate records, being in the middle might not always be the most efficient way to make use of bibliographic data.  OhioLINK&rsquo;s efforts are based on a state mandate to be more efficient and effective for the users of higher education libraries in Ohio.  On balance, the rules of the cooperative cannot trump what might be in the best interests of the members.  Asserting the right to impose policy restrictions on records</p><p><h2>Post-panel Thoughts</h2><br />At the end of co-panelist Karen Calhoun&#8217;s remarks, she encouraged attendees to send comments to the Review Board of Shared Data Creation and Stewardship via the <a href="mailto:recorduse@oclc.org">recorduse@oclc.org</a> e-mail address.  I certainly encourage interested parties to do that, but to also find some way to post it in a public forum.  The <a href="http://wiki.code4lib.org/index.php/OCLC_Policy_Change" title="OCLC Policy Change - Code4Lib">discussion of the proposed policy</a> has been both spirited and informative.  
Since this is a matter at the core of the cooperative, I don&#8217;t think the discussion should be limited to a one-way feed of information into the review board.  The discussion should also occur between us:  the members of the OCLC cooperative and community.  If you have a blog, post about it.  If not, consider <a href="http://lisnews.org/user/register" title="User account | LISNews">creating one at LISnews.org</a> and <span class="removed_link" title="http://lisnews.org/node/add/blog">post about it</span> there.  Or use mailing lists such as <a href="http://listserv.syr.edu/archives/autocat.html" title="Archives of AUTOCAT@LISTSERV.SYR.EDU">Autocat</a> and <a href="http://www.listserv.uga.edu/archives/radcat.html" title="Archives of RADCAT@LISTSERV.UGA.EDU">Radcat</a>.  OCLC already has a community forum platform &#8212; <a href="http://www.webjunction.org/home" title="WebJunction homepage">WebJunction</a> &#8212; and it would be good to see OCLC use that as a forum for public discussion.</p><p>Thanks to Charles Wilt, Executive Director of ALCTS, for inviting me to speak at the <a href="http://www.ala.org/ala/mgrps/divs/alcts/alcts.cfm" title="Association for Library Collections and Technical Services (ALCTS) homepage">ALCTS</a> Forum and to Karen Calhoun for facilitating the invitation.  
My appreciation also goes out to my co-panelists:  Karen Calhoun (who has <a href="http://www.slideshare.net/amarintha/creating-and-sustaining-communities-around-shared-data-the-case-of-oclc-presentation" title="Creating and Sustaining Communities Around Shared Data: The Case of OCLC - SlideShare">posted her slides online</a>), <a href="http://everybodyslibraries.com/2009/01/28/open-catalog-apis-and-data-ala-presentation-notes-posted/" title="Open catalog APIs and data: ALA presentation notes posted &amp;laquo; Everybody&amp;#8217;s Libraries">John Mark Ockerbloom</a> (who also <a href="http://works.bepress.com/john_mark_ockerbloom/10/" title="Open records, open possibilities">posted his slides and approximate speech transcript</a>), and Brian Schottlaender (who eloquently summarized statements from the other panelists and took point in fielding questions from the audience).<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://lisnews.org/node/add/blog on January 13th, 2011.</p><div class='series_links'><a href='http://dltj.org/article/oclc-records-use-policy-1/' title='Consideration of OCLC Records Use Policy'>Previous in series</a> <a href='http://dltj.org/article/guardian-correction/' title='Correction Added to Guardian Story on OCLC Record Use Policy'>Next in series</a></div>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/oclc-records-use-policy-2/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Consideration of OCLC Records Use Policy</title><link>http://dltj.org/article/oclc-records-use-policy-1/</link> <comments>http://dltj.org/article/oclc-records-use-policy-1/#comments</comments> <pubDate>Wed, 28 Jan 2009 04:46:11 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[policy]]></category> <category><![CDATA[ALA Midwinter 2009]]></category> <category><![CDATA[Amazon]]></category> <category><![CDATA[Biblios]]></category> 
<category><![CDATA[copyright]]></category> <category><![CDATA[description]]></category> <category><![CDATA[Google Book Search]]></category> <category><![CDATA[LibLime]]></category> <category><![CDATA[MARC]]></category> <category><![CDATA[OCLC]]></category> <category><![CDATA[Open Library]]></category><guid isPermaLink="false">http://dltj.org/?p=694</guid> <description><![CDATA[At ALA Midwinter, ALCTS sponsored a panel discussion about sharing library-created data inside and outside the library community, with a particular focus on cataloging data. I was honored to be asked to speak on the topic from the perspective of &#8230; <a href="http://dltj.org/article/oclc-records-use-policy-1/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=694"></abbr><p>At ALA Midwinter, ALCTS sponsored a panel discussion about sharing library-created data inside and outside the library community, with a particular focus on cataloging data.  I was honored to be asked to speak on the topic from the perspective of a consortial office.  This is the first in a series of posts that represents an approximation of what I said on the panel.  (Also be sure to read the summary of the session by <a href="http://www.libraryjournal.com/article/CA6632413.html" title="OCLC Defends Records Policy, Faces Questions, Suggestions, and Criticisms">Norman Oder in Library Journal</a>.)</p><p>I think it is important to step back and reflect on the nature of what we are talking about.  We build bibliographic records as surrogates for the desired object, meaning that the surrogate is a means to an end – retrieving the described object – and not an end unto itself. We build indexes of these surrogates for patrons to use to discover information.  All other factors held constant, the better the surrogate, the greater the chance the user will find the information they are seeking.  
The following discussion looks at the sources of records, the way they are built, and what it means to try to share them.</p><p><h2>Sources of Records</h2><br />The most familiar surrogates are the records generated by humans, and in our field <a href="http://www.aacr2.org/" title="AACR2 homepage">AACR2</a> encoded in <a href="http://www.loc.gov/marc/" title="MARC homepage">MARC</a> is the most common.  There are many sources of human-generated cataloging records.  For academic libraries, the most obvious is OCLC, but it is not the only one.  Integrated library systems include <a href="http://www.loc.gov/z3950/agency/" title="Z39.50 Maintenance Agency homepage">Z39.50</a> clients that enable the search of remote catalogs and import the resulting MARC records into the local catalog.  The Library of Congress catalog and the OhioLINK library catalog can be used in this fashion.  Records can also be purchased from vendors.  One emerging source is the recently announced ‘<a href="http://biblios.net/" title="biblios.net homepage">&Dagger;biblios.net</a>’ from <a href="http://liblime.com/" title="LibLime homepage">LibLime</a>.  The <a href="http://openlibrary.org/" title="Open Library homepage">Open Library project</a> of the Internet Archive could conceivably also be a source of cataloging records, although such use is not the goal of the project.</p><p>The humans creating these surrogate records are typically called “catalogers” although I’m coming to prefer the term descriptionists as a more accurate portrayal of their activity (if for no other reason than that the description activity engaged in by these professionals extends beyond what we traditionally consider “the catalog”).  
Using the tools of taxonomies and ontologies, the descriptionist creates the surrogate of the item to put into the catalog systems.</p><div id="attachment_google_tech_talk" class="wp-caption alignright" style="width: 314px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><embed id="VideoPlayback" FlashVars="initialTime=1765" src="http://video.google.com/googleplayer.swf?docid=2159021324062223592&#038;hl=en&#038;fs=true" style="width:300px;height:245px" allowFullScreen="true" allowScriptAccess="always" type="application/x-shockwave-flash"> </embed><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">Google Tech Talk by David Weinberger about his book 'Everything is Miscellaneous', starting at 29 minutes and 25 seconds into the talk</p></div><p>There is another way emerging, however, to create these surrogates:  through algorithmic computation against the object itself.  In his book <a href="http://www.everythingismiscellaneous.com/" title="Website for &#039;Everything is Miscellaneous&#039; book">Everything is Miscellaneous</a>, David Weinberger talks about the fungible nature of metadata and data: &ldquo;metadata&rdquo; is what we know and &ldquo;data&rdquo; is what we want to find out.  In a <a href="http://video.google.com/videoplay?docid=2159021324062223592" title="Everything is Miscellaneous">talk given at Google in May 2007</a>, he gave an example of using something known &mdash; like a quote from a book &mdash; to find out something &mdash; like the author of that book.  (Skip to 29 minutes and 25 seconds into the video.)  The &ldquo;metadata&rdquo; (the quote) was used to find the &ldquo;data&rdquo; (the author) that was being sought.</p><p>This is Google&rsquo;s big experiment in computational analytics.  It is relying on an analysis of the text of the item itself to be the descriptive surrogate.  
It employs algorithms that look at every word &mdash; perhaps even every concept &mdash; in an item and weight it relative to those words and terms in other items in the corpus.</p><p>Amazon has done something similar for years with its &ldquo;<a href="http://www.amazon.com/gp/phrase/help/help.html" title="Amazon.com: What are Key Phrases?">Key Phrases</a>&rdquo; techniques.  For Amazon, Capitalized Phrases are people, places, events, or terms mentioned in a book. Statistically Improbable Phrases are the most distinctive phrases in a book as compared against the corpus of texts in its catalog.  These become a form of index points &ndash; the surrogates &ndash; for finding the item.</p><p>Of human-generated or machine-generated, which of these is better &ndash; a measure of effectiveness in precision and recall as well as an assessment of the relative cost to create &ndash; is undoubtedly a topic of research and debate.  For those of us more familiar with the ways of the descriptionists, however, it is undoubtedly time to become familiar with the ways of the computationalists to understand the strengths of each.</p><p><h2>Ownership of records</h2><br />Our surrogate records contain two varieties of data:  recitation of facts and efforts of creativity.  Under U.S. intellectual property law, facts are not creative works and therefore are not covered by copyright.  (This is the common interpretation of <a href="http://en.wikipedia.org/wiki/Feist_v._Rural" title="Feist Publications v. Rural Telephone Service - Wikipedia">Feist v. Rural</a>, a 1991 Supreme Court case that determined that telephone directories are not creative works and therefore are not offered copyright protection.)  The legal status of our surrogate records is somewhat murkier, though.  
While the inclusion of facts such as title, author, and publisher is not a creative act, the assignment of classification numbers and subject headings could be a creative act, and the person doing the creation would hold copyright for those acts.</p><p>Ownership of the data in records is even cloudier than what is outlined above.  By my estimation, we are entering a world in which there are four types of attributes that make up our bibliographic records. The first is the recitation of the facts, whether copied from the item-in-hand or obtained as a feed of information from publishers.  The second is the creative work of the descriptionist in adding value to the surrogate in the form of classification numbers, subject headings, abstracts, and the like.  The third is the additions that the computationalists &ndash; or, more specifically, their algorithms &ndash; bring to the object&rsquo;s description in the surrogate.  This can take the form of the previously discussed Google Book Search algorithms as well as Amazon&rsquo;s Capitalized Phrases and Statistically Improbable Phrases.  It also includes actions specific to our traditional domain of human-generated surrogates; for instance, the algorithms OCLC runs across the WorldCat dataset to merge and improve records.  And finally, thinking beyond the boundaries typical of AACR2 and MARC, there is a fourth type: user-contributed information.  The creative efforts of users to add tags, reviews, summaries, and commentary add to the surrogate, and with that we kick wide open a door to the problem of who &ldquo;owns&rdquo; these surrogate records.</p><p>The issue of who &ldquo;owns&rdquo; the user-contributed information to records is handled in a wide variety of ways.  
In LibraryThing, for instance, the <a href="http://www.librarything.com/privacy#terms" title="LibraryThing terms of use">creator retains copyright</a> over tags, reviews, summaries, and comments, and grants LibraryThing non-exclusive rights to make use of that creative effort.  LibraryThing also has a function where users can add &ldquo;Common Knowledge&rdquo; (character names and occupations, locations, honors/awards, and quotations, among other details) about the item.  Users <a href="http://www.librarything.com/wiki/index.php/Common_Knowledge_License" title="Common Knowledge License">add Common Knowledge augmentations with a Creative Commons &ldquo;By, Share-Alike&rdquo; license</a>.  The <a href="http://openlibrary.org/about/license" title="Open Library project license">Open Library project asserts no rights</a> over the data in its system; it declares the material in the open library database to be facts, and therefore in the public domain (at least under U.S. copyright law).  The newly announced <a href="http://biblios.net/pddl" title="biblios.net license">&Dagger;biblios.net repository</a> of records uses the Open Data Commons Public Domain Dedication and License, developed by Talis in the UK and Creative Commons.  It, too, asserts that the subject data is in the public domain and offers suggested community norms surrounding the use of the data.</p><p>Ownership of records is also a continuing source of discussion among members of the OCLC cooperative.  If a brief record from a publisher is added into the WorldCat database and is subsequently enhanced by one or more members, who &ldquo;owns&rdquo; the record?  If a library systematically adds enhancements to records that matter to its own community &ndash; enhancements such as paper and binding types &ndash; who &ldquo;owns&rdquo; those records?</p><p>One way to handle this variety of who (if anyone) owns portions of the surrogate is to split records based on who contributed what.  
This would enable consuming systems to make decisions on what data to include based on the conditions the owners might put on their creative acts.  (Keep in mind that legal precedent would seem to indicate that the facts expressed in the surrogate couldn&rsquo;t be protected by copyright restrictions.)  But our existing record structures do not give us a way to do this, nor does there seem to be any work going on to enable this to happen.  Perhaps we are all just burying our collective heads in the sand and hoping it will go away.</p><p><h2>More to come&#8230;</h2><br /><a href="http://dltj.org/article/oclc-records-use-policy-2/">Part 2</a> reflects on the implication of restrictive licenses, with a focus on the now-withdrawn OCLC records use policy, and a call for open discussion.<p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from https://biblios.net/ to http://biblios.net/ on February 11th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from https://biblios.net/pddl to http://biblios.net/pddl on February 11th, 2011.</p><div class='series_links'> <a href='http://dltj.org/article/oclc-records-use-policy-2/' title='Further Consideration of OCLC Records Use Policy'>Next in series</a></div>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/oclc-records-use-policy-1/feed/</wfw:commentRss> <slash:comments>8</slash:comments> </item> <item><title>Open Library Demonstration Screencast</title><link>http://dltj.org/article/open-library/</link> <comments>http://dltj.org/article/open-library/#comments</comments> <pubDate>Fri, 20 Jul 2007 14:05:12 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Disruption in Libraries]]></category> <category><![CDATA[description]]></category> <category><![CDATA[Internet Archive]]></category> <category><![CDATA[library 2.0]]></category> <category><![CDATA[ngc4lib]]></category> <category><![CDATA[Open 
Library]]></category> <category><![CDATA[screencast]]></category><guid isPermaLink="false">http://dltj.org/2007/07/open-library/</guid> <description><![CDATA[Earlier this week, Aaron Swartz of the Internet Archive <a href="http://www.aaronsw.com/weblog/openlibrary" title="Announcing the Open Library (Aaron Swartz&#039;s Raw Thought)">announced</a> the <a href="http://demo.openlibrary.org/" title="The Open Library demonstration site homepage">demonstration website of the Open Library project</a>, a new kind of book catalog that brings together traditional publisher and library bibliographic data in an interface with the user-contributed paradigm of Wikipedia.  Okay, I'll pause for a moment while you parse that last sentence.  Think you got it?  Read -- and watch -- further.Open Library has been <a href="http://www.librarything.com/thingology/2007/07/open-library.php" title="Open Library (Thingology - LibraryThing&#039;s ideas blog)">mentioned</a> a <a href="http://digitaleccentric.blogspot.com/2007/07/open-library.html" title="Open Library (Digital Eccentric blog)">bit</a> in the <a href="http://blogs.talis.com/panlibus/archives/2007/07/license_for_ope.php" title="License for Open Library? (panlibus blog)">blogs</a> <a href="http://www.libraryjournal.com/blog/1090000309/post/1800011980.html" title="The People&#039;s Catalog (Roy Tennant&#039;s blog)">this week</a>, but not to the extent I thought was worthy of the magnitude of the project.  So I recorded a screencast introduction (in Flash Video format below followed by a rough transcript) that looks at not only the browsing side of the system but also the record editing and record creation aspects of Open Library.  As I say at the end of the recording, Open Library is one of those mind-bending, assumption-shattering projects that, at least for me, is challenging my thoughts about what library service could be and should be.  
Congratulations to the team at the Internet Archive, and I'm looking forward to future enhancements and directions for the project. <a href="http://dltj.org/article/open-library/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/07/open-library/"></abbr><p>Earlier this week, Aaron Swartz of the Internet Archive <a href="http://www.aaronsw.com/weblog/openlibrary" title="Announcing the Open Library (Aaron Swartz&#039;s Raw Thought)">announced</a> the <a href="http://demo.openlibrary.org/" title="The Open Library demonstration site homepage">demonstration website of the Open Library project</a>, a new kind of book catalog that brings together traditional publisher and library bibliographic data in an interface with the user-contributed paradigm of Wikipedia.  Okay, I&#8217;ll pause for a moment while you parse that last sentence.  Think you got it?  Read &#8212; and watch &#8212; further.</p><p>Open Library has been <a href="http://www.librarything.com/thingology/2007/07/open-library.php" title="Open Library (Thingology - LibraryThing&#039;s ideas blog)">mentioned</a> a <a href="http://digitaleccentric.blogspot.com/2007/07/open-library.html" title="Open Library (Digital Eccentric blog)">bit</a> in the <a href="http://blogs.talis.com/panlibus/archives/2007/07/license_for_ope.php" title="License for Open Library? (panlibus blog)" class="broken_link" rel="nofollow">blogs</a> <a href="http://www.libraryjournal.com/blog/1090000309/post/1800011980.html" title="The People&#039;s Catalog (Roy Tennant&#039;s blog)" class="broken_link" rel="nofollow">this week</a>, but not to the extent I thought was worthy of the magnitude of the project.  
So I recorded a screencast introduction (in Flash Video format below followed by a rough transcript) that looks at not only the browsing side of the system but also the record editing and record creation aspects of Open Library.  As I say at the end of the recording, Open Library is one of those mind-bending, assumption-shattering projects that, at least for me, is challenging my thoughts about what library service could be and should be.  Congratulations to the team at the Internet Archive, and I&#8217;m looking forward to future enhancements and directions for the project.<br /><br /><object type="application/x-shockwave-flash" data="http://dltj.org/wp-content/plugins/pb-embedflash/swf/mediaplayer.swf?width=720&amp;height=500" width="720" height="500" class="embedflash"><param name="movie" value="http://dltj.org/wp-content/plugins/pb-embedflash/swf/mediaplayer.swf?width=720&amp;height=500" /><param name="allowfullscreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="flashvars" value="file=http://drc-dev.ohiolink.edu/presentations/open-library-screencast.flv&amp;searchbar=false" /><small>(Please open the article to see the flash file or player.)</small></object></p><p>Rough transcript of the screencast is below.</p><p><h2>Introduction</h2></p><p>Hello, and welcome to this screencast overview of the <a href="http://openlibrary.org/" title="The Open Library homepage">Open Library project</a>.  Open Library is an effort by the Internet Archive to create a comprehensive catalog of every book.  
As the <a href="http://demo.openlibrary.org/about" title="About Us (The Open Library)">project&#8217;s &#8220;about&#8221; page</a> says, &#8220;Not every book on sale, or every important book, or even every book in English, but simply every book.&#8221;  The about page goes on to describe the characteristics of the Open Library project &#8212; that it is a project enabled by Internet technology because no physical space could hold it and that it aims to pull together records from publishers and libraries.  It is also a project in the same vein as Wikipedia, meaning that any user can create and edit the records in the system.</p><p>In this overview, I&#8217;ll lead you through searching and browsing the Open Library&#8217;s demonstration website from the perspective of any modern library catalog interface.  Then I&#8217;ll show you where it deviates from traditional library catalogs by exposing the underlying wiki nature of the database; we&#8217;ll examine the changes that users have made and we&#8217;ll even make a change ourselves.  And finally I&#8217;ll show the process of creating entirely new records in the system.  So let&#8217;s get started.</p><p><h2>Searching</h2></p><p>We&#8217;re looking at the <a href="http://demo.openlibrary.org/" title="The Open Library demonstration site homepage">home page of the Open Library project demonstration site</a>.  In the middle is a search box with a suggested search &#8212; &#8220;tom sawyer adventure&#8221;.  That is a good suggestion so we&#8217;ll click on Go.  Open Library returns <a href="http://demo.openlibrary.org/search?q=tom+sawyer+adventure" title="Search Results (The Open Library)">a classic, relevance-ranked list of matching records</a> with some book covers along the left side and a faceted list of refinements along the right.  
So right away you can see that there are some authority control problems here in the author names &#8212; Twain comma Mark, Mark comma Twain, and Twain comma Mark with birth and death dates &#8212; and here in the language field.  But I have high hopes that the developer team will find some intriguing ways to address these problems.</p><p>Back over here in the results area we have the various editions of Samuel Clemens&#8217;s &#8220;The Adventures of Tom Sawyer&#8221; &#8212; let&#8217;s pick <span class="removed_link" title="http://demo.openlibrary.org/b/adventures_of_Tom_Sawyer">the 1876 edition to see the full record display</span> &#8212; there.  We have the publisher, publication date and place, language, and a summary or review of sorts at the bottom.  We also see signs of the availability of full text &#8212; over here in the options box there is a <a href="http://openlibrary.org/details/adventuresoftoms00twaiuoft" title="Open Library: Details: The adventures of Tom Sawyer">download from the Internet Archive link</a>, a &#8220;Scan Sponsor&#8221; field here and a &#8220;View this book&#8221; graphic.  This is one of the items scanned by the Open Content Alliance and made available by the Internet Archive through the Open Library project.  A very nice interface for paging through the book.  So one could imagine that the Open Library could become the primary vehicle by which Open Content Alliance materials are made available to the public.</p><p>So let&#8217;s go back here to the metadata page.  Remember in the introduction that I said that the data was malleable in a wiki-like fashion.  The Open Library developers created a system that allows for user-contributed updates (a la Wikipedia) to fielded data (like your classic bibliographic record).  
The two hints that the record is modifiable are this big edit button in the middle of the metadata and this more subtle <span class="removed_link" title="http://demo.openlibrary.org/b/adventures_of_Tom_Sawyer?m=history">&#8220;[history]&#8221; link</span> near the top of the page.  Let&#8217;s start with the history link to see what has been done to this record.</p><p>This page should look familiar to those who have worked with wikis before.  It shows a listing of edits that were made to this record from most recent to the very first edit, who made the change (identified by IP addresses in this case because the people making the changes were not logged in to an account at the time), an editor-supplied comment about what was done, and when the change was made.  We can go back in time and see the page at a particular version through the links under the &#8220;When&#8221; column, or we can use the compare function to see the difference between two versions.  In the case of <span class="removed_link" title="http://demo.openlibrary.org/b/adventures_of_Tom_Sawyer?b=3&amp;a=2&amp;m=diff">the changes between version 2 and version 3</span>, we see that the editor added &#8220;Canada&#8221; as the place of publication.  On this page you start to see the fielded nature of this wiki structure, but the best place to see it is to look at the record edit screen itself.</p><p>These are all full-text fields on this page with no controlled vocabulary.  You&#8217;ll note the absence of any MARC field names here, but as you scroll through you&#8217;ll see the evidence of MARC and AACR2 in the field labels.  
Down at the bottom is an edit summary to describe the changes made to the record, then save, preview and delete version buttons &#8212; all classic wiki functions.</p><p><h2>Editing</h2></p><p>Now, I&#8217;d like to show the full record editing process, but since I don&#8217;t have this Mark Twain book in hand, I&#8217;m going to bring up another record that I created yesterday &#8212; &#8220;<span class="removed_link" title="http://demo.openlibrary.org/b/Eric_Meyer_on_CSS">Eric Meyer on CSS</span>&#8221;.  Before showing the editing process, let&#8217;s linger here a moment at the &#8220;options&#8221; box along the right side.  Since this is a more modern book (as opposed to the Tom Sawyer book we saw first), there are additional options here for purchasing the book through these various vendors or borrowing the book through a very nice link into Open Worldcat and two web-based book trading sites.</p><p>But back to the metadata.  There is one error and one omission in this record &#8212; perhaps this is a subtle demonstration of problems that creep in with user-generated content.  First, the error is that there is an extra digit in the ISBN-10 field, which is a big problem because the links in the options box use the ISBN as a linking field and at the time of this recording they don&#8217;t work.  They will work in a moment, though.  The second problem is that I forgot to put in the publication date.  But hey, no problem, all I need to do is &#8220;Edit&#8221; this record.</p><p>So we are back to <a href="http://demo.openlibrary.org/b/Eric_Meyer_on_CSS?m=edit" title="edit Eric Meyer on CSS : Mastering the Language of Web Design (The Open Library)">the edit screen</a>, and I&#8217;m going to scroll down and fix the ISBN-10 field like so, then scroll down a little further and add the publication date.  
Then I&#8217;ll scroll all the way to the bottom and type in an edit summary &#8212; &#8220;Fixed the ISBN and added a publication date&#8221; &#8212; and hit save.  We&#8217;re now back at the metadata display screen and <a href="http://worldcat.org/isbn/0-73571-245-x" title="Eric Meyer on CSS : mastering the language of Web design [WorldCat.org]">the link to Open Worldcat</a> now works.  So, as an aside, one wonders what the folks in Dublin, Ohio, think about this.  It is competition on the one hand since Worldcat is also aiming to be the most comprehensive catalog of books in the world.  On the other hand, perhaps there is room for cooperation by somehow getting vetted changes to Open Library records into the OCLC union catalog.  Who knows?</p><p><h2>Creating a New Record</h2></p><p>Alright, back to current reality.  Let&#8217;s add a record to Open Library, and in this case I&#8217;m going to use an ARL SPEC Kit that I wrote a number of years ago called &#8220;Library Patron Privacy&#8221;.  First let&#8217;s run a search in Open Library to see if it is there, and no, it isn&#8217;t.  The only way I&#8217;ve figured out how to enter a new item is to go to the URL where the page would be located and get the classic wiki &#8220;This page does not exist. Create it?&#8221; message.</p><p>One of the quirks I found in the system is that I have to create author wiki pages before book wiki pages &#8212; otherwise I&#8217;ll get a Python error message on the screen.  I&#8217;ve reported this to the Open Library developers, but in the meantime just know authors need to be created before their books.  Which is to say that authors have wiki pages in Open Library in addition to books.  The structure of URLs to Open Library author pages is the letter &#8220;a&#8221; followed by a slash followed by the author&#8217;s last, first and middle names separated by underscore characters.  
So I&#8217;ll go to the URL of that form, then click on the &#8220;Create it&#8221; link.</p><p>Now here is one of the tricky parts of the existing interface.  The page type starts as &#8220;type/page&#8221;, and as you can see it doesn&#8217;t have any of the fielded elements that we saw in previous examples.  What you have to do is change the page type to &#8220;type/author&#8221; and then you get the fielded HTML form.  So I&#8217;m going to go through here and fill in some of the parts.  Then go down to the edit summary field and write a summary of this change, then click save.  Now that <span class="removed_link" title="http://demo.openlibrary.org/a/Murray_Peter_E">Open Library knows who I am</span>, let&#8217;s create the record for the book.</p><p>You&#8217;ve seen the structure of the URLs to book pages before &#8212; a &#8220;b&#8221; followed by a slash followed by the book title with spaces replaced by underscore characters.  I&#8217;ll put that in the URL field and get the default page type.  This needs to be changed to &#8220;type/edition&#8221; in order to get the bibliographic record fields.  There.  Now I&#8217;ll go through here and enter the data.  When we get down to the author field we enter it in the same format that we used to create it &#8212; an &#8220;a&#8221; followed by a slash followed by the name with spaces replaced by underscores.</p><p>So we&#8217;ll just finish up here and come down to the edit summary field, put something in here, and hit save. <span class="removed_link" title="http://demo.openlibrary.org/b/Library_Patron_Privacy">This record</span> is now in the system, and you can see the public display here along with the links on the right because I entered an ISBN.  
I haven&#8217;t quite figured out how to get a cover image into the system yet &#8212; I expect there is a file upload interface somewhere, but I haven&#8217;t found it.</p><p><h2>Conclusions</h2></p><p>So that&#8217;s all there is, and I don&#8217;t say that in a way to denigrate the work that has been done by the development team so far.  As the URL and site banner indicate, it is a demonstration system &#8212; and a compelling demonstration it is.  All sorts of questions immediately come to mind, of course &#8212; will there be a controlled vocabulary or authority control built into the system, can data be exported out of records &#8212; and, for that matter, can end-users bulk import data into the system, are there Web 2.0 niceties like tagging and RSS feeds in the works, and so forth.</p><p>Even with all of those questions, Open Library is one of those mind-bending, assumption-shattering projects that, at least for me, is challenging my thoughts about what library service could be and should be.  
Congratulations to the team at the Internet Archive, and I&#8217;m looking forward to future enhancements and directions for the project.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://demo.openlibrary.org/b/adventures_of_Tom_Sawyer on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://demo.openlibrary.org/b/adventures_of_Tom_Sawyer?m=history on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://demo.openlibrary.org/b/adventures_of_Tom_Sawyer?b=3&#038;a=2&#038;m=diff on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://demo.openlibrary.org/b/Eric_Meyer_on_CSS on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://demo.openlibrary.org/a/Murray_Peter_E on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://demo.openlibrary.org/b/Library_Patron_Privacy on January 19th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/open-library/feed/</wfw:commentRss> <slash:comments>9</slash:comments> <enclosure url="http://drc-dev.ohiolink.edu/presentations/open-library-screencast.flv" length="0" type="video/x-flv" /> </item> </channel> </rss>
