<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"><channel><title>Disruptive Library Technology Jester &#187; Google Scholar</title> <atom:link href="http://dltj.org/tag/google-scholar/feed/" rel="self" type="application/rss+xml" /><link>http://dltj.org</link> <description>We&#039;re Disrupted, We&#039;re Librarians, and We&#039;re Not Going to Take It Anymore</description> <lastBuildDate>Mon, 06 Feb 2012 20:04:22 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <cloud domain='dltj.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' /> <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/3.0/us/</creativeCommons:license> <item><title>Thursday Threads: HarperCollins/OverDrive (still), Wikimedia Survey, Microsoft Academic Search</title><link>http://dltj.org/article/thursday-threads-2011w14/</link> <comments>http://dltj.org/article/thursday-threads-2011w14/#comments</comments> <pubDate>Thu, 07 Apr 2011 10:10:52 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Thursday Threads]]></category> <category><![CDATA[Google Scholar]]></category> <category><![CDATA[HarperCollins-OverDrive controversy]]></category> <category><![CDATA[Microsoft Academic Search]]></category> <category><![CDATA[Wikipedia]]></category><guid isPermaLink="false">http://dltj.org/?p=2784</guid> <description><![CDATA[Receive DLTJ Thursday Threads:by&#160;E-mailby&#160;RSSDelivered by FeedBurner We can&#8217;t leave the hot topic of ebooks behind in this edition of DLTJ 
Thursday Threads, but at least it is only the lead thread and not the entire focus of this post. HarperCollins &#8230; <a href="http://dltj.org/article/thursday-threads-2011w14/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=2784"></abbr><div id="feedburner-thursday-threads-email-2011w14" class="wp-caption alignright noprint noFrontPage" style="width: 230px;;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><form style="border: 1px solid rgb(204, 204, 204); padding: 3px; margin: 0pt; text-align: center;" action="http://feedburner.google.com/fb/a/mailverify" method="post" target="popupwindow" onsubmit="window.open('http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads', 'popupwindow', 'scrollbars=yes,width=550,height=520');return true"><p>Receive <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> Thursday Threads:</p><p>by&nbsp;<a href="http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads&amp;loc=en_US" title="D.L.T.J. Thursday Threads Email Subscription">E-mail</a><br /><input style="width: 140px;" name="email" value="Your e-mail address" onfocus="if (this.defaultValue==this.value) this.value = ''" type="text"/><input value="thursday-threads" name="uri" type="hidden"/><input name="loc" value="en_US" type="hidden"/><input value="Subscribe" type="submit"/></p><p>by&nbsp;<a href="http://feeds.dltj.org/thursday-threads/" title="D.L.T.J. 
Thursday Threads RSS Feed">RSS</a></p><p style="font-size: 80%;">Delivered by <a href="http://feedburner.google.com" target="_blank" title="Google Feedburner Service">FeedBurner</a></p></form></div><p> We can&#8217;t leave the hot topic of ebooks behind in this edition of <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym> Thursday Threads</i>, but at least it is only the lead thread and not the entire focus of this post.  HarperCollins made news when one of its <a href="#p2784-hcod">executives appeared at a symposium in Connecticut</a> and said that the new digital circulation policy was a &#8220;work in progress&#8221;.  Leaving that aside, Wikimedia is seeking responses to a <a href="#p2784-wikipedia">survey to find out what barriers exist to expert contributions</a>.  Lastly is a call to <a href="#p2784-microsoft-academic-search">keep Microsoft Research&#8217;s Academic Search on your radar screen</a>; some interesting updates are coming out that rival Google Scholar and perhaps even some subscription services.</p><p>Feel free to send this to others you think might be interested in the topics.  If you find these threads interesting and useful, you might want to add the <a href="http://feeds.dltj.org/thursday-threads/" title="RSS Feed for DLTJ Thursday Threads">Thursday Threads RSS Feed</a> to your feed reader or subscribe to e-mail delivery using the form to the right.  If you would like a more raw and immediate version of these types of stories, watch <a href="http://friendfeed.com/dltj" title="Peter Murray - FriendFeed">my FriendFeed stream</a> (or subscribe to <a href="http://friendfeed.com/dltj?format=atom" title="Atom feed for Peter Murray's FriendFeed account">its feed</a> in your feed reader).  
Comments and tips, as always, are <a href="http://dltj.org/contact">welcome</a>.</p><p><h2 id="p2784-hcod">HarperCollins Executive Calls Circulation Cap a &#8220;Work in Progress&#8221;</h2></p><blockquote><p>HarperCollins knew that <a href="http://www.libraryjournal.com/lj/home/889452-264/harpercollins_caps_loans_on_ebook.html.csp" title="HarperCollins Puts 26 Loan Cap on Ebook Circulations | Library Journal">its decision to cap ebook circulations at 26</a> would generate some heat, but the intensity, nonetheless, surprised the company.</p><p>&#8220;We certainly expected a variety of responses, and we knew that there would be a lot of people that had issues,&#8221; Josh Marwell, president of sales, told <em>LJ</em> after speaking to some 150 librarians gathered Tuesday at the <a href="http://www.darienlibrary.org/" title="Darien Library">Darien Library</a>, CT, as part of &#8220;<a href="https://m360.ctlibrarians.org/event.aspx?eventID=27527&amp;instance=0">eBooks: Collections at the Crossroads</a>,&#8221; a symposium organized by the Connecticut Library Consortium (#clctrendspotting, #clcebooks). &#8220;I think what was surprising was the intensity and how widespread it was. 
There were a lot of people who are not carrying ebooks now who entered into the fray,&#8221; he said.</p><div style="text-align: right; width: 100%;"><cite>- <a href="http://www.libraryjournal.com/lj/home/890077-264/harpercollins_executive_calls_circulation_cap.html.csp" title="HarperCollins Executive Calls Circulation Cap a &#039;Work in Progress&#039; | Library Journal">HarperCollins Executive Calls Circulation Cap a &#8220;Work in Progress&#8221;</a>, Michael Kelley, Library Journal</cite></div></blockquote><p>In the first public statement from HarperCollins since their <a href="http://harperlibrary.typepad.com/my_weblog/2011/03/open-letter-to-librarians.html" title="Open Letter to Librarians | Library Love Fest">open letter to librarians</a> was published over a month ago, <a href="http://www.harpercollins.com/footer/release.aspx?id=235&amp;b=&amp;year=2004" title="Corporate Press Releases, HarperCollins Publishers">president of sales Josh Marwell</a> went on to say, &#8220;Is 26 set in stone? No. It&#8217;s our number for now, but we want to hear back. Immediately. Honestly, it doesn&#8217;t make sense that one size fits all. We consider it a work in progress. But this is the number that we have now.&#8221;  Unfortunately, there doesn&#8217;t seem to be a recording of Mr. Marwell&#8217;s remarks or the resulting discussion.  It would appear from the little bits in the Library Journal article and what can be found on <a href="http://search.twitter.com/search?q=+%23clcebks+since%3A2011-04-04+until%3A2011-04-06" title="Twitter search for #clcebks since:2011-04-04 until:2011-04-06">Twitter</a> that HarperCollins is not considering a fundamental change to the new policy of limited numbers of uses (otherwise known as digital &#8220;checkouts&#8221;); only the actual number is open to discussion.  That seems like a non-starter to me as a negotiating point.  
If HarperCollins really wants a dialog on the matter, we need to step back and look at the whole model.  Or several models &#8212; one for front-list blockbusters, one for long-tail titles, distinctions between public and academic users, etc.</p><p>On a related note &#8212; related only in that HarperCollins and OverDrive are forever bound together in the &#8220;hcod&#8221; hashtag abbreviation (and to give equal time to both parties in this thread) &#8212; Joe Atzberger does a pretty good job <a href="http://libraryhacker.org/2011/04/04/underdone-autopsy-of-an-overdrive-eula/" title="Underdone: Autopsy of an OverDrive EULA | Library Hackers Unite!">dissecting</a> the OverDrive End User License Agreement (EULA).  (An <a href="http://libraryhacker.org/2011/04/04/executive-summary-autopsy-of-an-overdrive-eula/" title="Executive Summary: Autopsy of an OverDrive EULA  | Library Hackers Unite!">executive summary</a> is available.)  He says, &#8220;The OMC [OverDrive Media Console] EULA is the product of obvious cut-and-paste composition and questionable original language.  It repeats and contradicts itself and contains nonsensical references.  More seriously, it disqualifies OMC from all pertinent uses, levies prohibitions against libraries specifically, attempts to obligate the user to illegal or impractical conditions, and indicates an unlicensed open-source dependency.&#8221;</p><p><h2 id="p2784-wikipedia">Wikipedia Surveys Subject Experts to Find Barriers to Contributions</h2></p><blockquote><p>Wikipedia is now widely regarded as a mature project and is consulted by a large fraction of internet users, including academics and other experts. However, many of them are still reluctant to contribute to it. 
The aim of this survey is to understand why <b>scientists</b>, <b>academics</b> and other <b>experts</b>&nbsp;do (or do not) contribute to an open collaborative project such as Wikipedia, and whether individual motivation aligns with shared perceptions of Wikipedia within expert communities. We hope this may help us identify ways around barriers to expert participation.</p><p>The survey is anonymous and should take about 10 min to complete. It consists of a short introduction, followed by two main sections in which we contrast shared perceptions and personal motivation, and a final section where you can tell us more about yourself. At the end of the survey, you will find a link to follow the results and the ensuing conversation.</p><div style="text-align: right; width: 100%;"><cite>- <a href="http://survey.nitens.org/?sid=21693" title="SURVEY: Expert barriers to Wikipedia">Expert barriers to Wikipedia</a></cite></div></blockquote><p>This survey from the <a href="http://meta.wikimedia.org/wiki/Research_Committee" title="Research Committee | Wikimedia">Wikimedia Research Committee</a> is asking subject experts why they don&#8217;t contribute to Wikipedia.  I think librarians of all types (public, academic, special, etc.) certainly count among the target audience, so I think it is appropriate for this survey to get some traction in our community.  Questions include a section about how participating in Wikipedia editing is perceived by colleagues and another about motivations to contribute.  It did take about 10 minutes to complete.  
[Survey found via George Siemens on Twitter]</p><p>On the other hand is an <a href="http://www.insidehighered.com/news/2011/04/052/college_libraries_use_wikipedia_to_increase_exposure_of_their_collections" title="Wielding Wikipedia -|Inside Higher Ed">article from Inside Higher Ed</a> summarizing a <a href="http://www.goeshow.com/acrl/national/2011/profile.cfm?profile_name=session&amp;master_key=7C4FEBE7-D609-A648-274F-D58211D22CF2&amp;page_key=558E302F-DDE0-40D3-8D24-C1CC650F9288&amp;xtemplate&amp;userLGNKEY=0" title="Wikipedia Lover, Not a Hater: Harnessing Wikipedia to Increase the Discoverability of Library Resources">session at ACRL</a> where librarians at the University of Houston are actively contributing image content to <a href="http://en.wikipedia.org/wiki/Wikimedia_commons" title="Wikimedia Commons | Wikipedia">Wikimedia Commons</a>, Wikipedia&#8217;s media library (<a href="http://www.goeshow.com/acrl/national/2011/client_uploads/handouts/ACRL%202011--HANDOUT--Elder,%20Reilly,%20Westbrook1.pdf" title="Presentation handout">handout</a>, <a href="http://www.goeshow.com/acrl/national/2011/client_uploads/handouts/ACRL%202011--PRESENTATION--Elder,%20Reilly,%20Westbrook1.pptx" title="Presentation slides">presentation slides</a>).  The comments to the article are almost more instructive than the summary of the presentation, with a lot of back-and-forth about the ethics of promoting library materials in Wikipedia in this manner.</p><p><h2 id="p2784-microsoft-academic-search">Microsoft Academic Search (Beta) Appears Ready to Expand Database Coverage</h2></p><blockquote><p>Look out Google Scholar. Get ready for the academic/scholarly search war to begin very very soon. 
We think many info industry database providers will also have an interest in a new service from Microsoft.</p><p>Here’s why.</p><p>Since October 2009 we’ve been covering, speaking about and paying very close attention to <a href="http://academic.research.microsoft.com/" title="Microsoft Academic Search">Microsoft Academic Search.</a> The product is being developed by Microsoft Research primarily by their team in Asia. It’s important to note that MS Academic is nothing like the mediocre (and that’s being kind) MS Live Academic Search available a few years ago.</p><p>In our view, <a href="http://academic.research.microsoft.com/" title="Microsoft Academic Search">Microsoft Academic Search</a> turns the open web academic/scholarly material search game up to 11.</p><div style="text-align: right; width: 100%;"><cite>- <a href="http://infodocket.com/2011/03/31/microsoft-academic-search-beta-appears-ready-to-expand-into-many-domains-of-knowledge/" title="Microsoft Academic Search (Beta) Appears Ready to Expand Database Coverage | INFOdocket">Microsoft Academic Search (Beta) Appears Ready to Expand Database Coverage</a>, Gary Price, INFOdocket</cite></div></blockquote><p>This is a resource to watch &#8212; it does appear that <a href="http://academic.research.microsoft.com/" title="Microsoft Academic Search">Microsoft Academic Search</a>, a project of <a href="http://research.microsoft.com/en-us/" title="Microsoft Research homepage">Microsoft Research</a>, is gearing up.  
With more features than <a href="http://scholar.google.com/" title="Google Scholar homepage">Google Scholar</a> and an underlying database that appears to be catching up in size and scope, it will be interesting to see if a feature and content war breaks out between Microsoft and Google.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/thursday-threads-2011w14/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> <item><title>Thursday Threads: Amazon Pressures Publishers, Academic Spam, Mechanical Turk Spam, Multispectral Imaging</title><link>http://dltj.org/article/thursday-threads-2010w52/</link> <comments>http://dltj.org/article/thursday-threads-2010w52/#comments</comments> <pubDate>Thu, 30 Dec 2010 12:07:28 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Thursday Threads]]></category> <category><![CDATA[Amazon]]></category> <category><![CDATA[Amazon Mechanical Turk]]></category> <category><![CDATA[digitization]]></category> <category><![CDATA[Google Scholar]]></category> <category><![CDATA[jpeg2000]]></category> <category><![CDATA[preservation]]></category> <category><![CDATA[publishing]]></category> <category><![CDATA[search engine]]></category> <category><![CDATA[spam]]></category><guid isPermaLink="false">http://dltj.org/?p=1931</guid> <description><![CDATA[Receive DLTJ Thursday Threads:by&#160;E-mailby&#160;RSSDelivered by FeedBurner With the close of the year approaching, this issue marks the 14th week of DLTJ Thursday Threads. 
This issue has a publisher&#8217;s view of Amazon&#8217;s strong-arm tactics in book pricing, research into the possibility &#8230; <a href="http://dltj.org/article/thursday-threads-2010w52/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=1931"></abbr><div id="feedburner-thursday-threads-email-w52" class="wp-caption alignright" style="width: 230px;;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><form style="border:1px solid #ccc;padding:3px;margin:0;text-align:center;" action="http://feedburner.google.com/fb/a/mailverify" method="post" target="popupwindow" onsubmit="window.open('http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads', 'popupwindow', 'scrollbars=yes,width=550,height=520');return true"><p>Receive <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> Thursday Threads:</p><p>by&nbsp;<a href="http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads&#038;loc=en_US" title="D.L.T.J. Thursday Threads Email Subscription">E-mail</a><br /><input type="text" style="width:140px" name="email" value="Your e-mail address" onFocus="if (this.defaultValue==this.value) this.value = ''"/><input type="hidden" value="thursday-threads" name="uri"/><input type="hidden" name="loc" value="en_US"/><input type="submit" value="Subscribe" /></p><p>by&nbsp;<a href="http://feeds.dltj.org/thursday-threads/" title="D.L.T.J. Thursday Threads RSS Feed">RSS</a></p><p style="font-size: 80%">Delivered by <a href="http://feedburner.google.com" target="_blank" title="Google Feedburner Service">FeedBurner</a></p></form></div><p> With the close of the year approaching, this issue marks the 14th week of <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym> Thursday Threads</i>.  
This issue has a publisher&#8217;s view of Amazon&#8217;s strong-arm tactics in book pricing, research into the possibility that academic authors could game Google Scholar with spam, demonstrations of how Amazon&#8217;s Mechanical Turk drives down the cost of enlisting humans to overwhelm anti-spam systems, and a story of multispectral imaging adding information in the process of digital preservation.</p><p>As the new year approaches, I wish you the best professionally and personally.</p><p><h2><a name="books_after_amazon">Books After Amazon</a></h2></p><blockquote><p>What happens when an industry concerned with the production of culture is beholden to a company with the sole goal of underselling competitors? Amazon is indisputably the king of books, but the issue remains, as Charlie Winton, CEO of the independent publisher Counterpoint Press puts it, “what kind of king they’re going to be.” A vital publishing industry must be able take chances with new authors and with books that don’t have obvious mass-market appeal. When mega-retailers have all the power in the industry, consumers benefit from low prices, but the effect on the future of literature—on what books can be published successfully—is far more in doubt.</p></blockquote><p><a href="http://www.bostonreview.net/BR35.6/roychoudhuri.php" title="Boston Review &amp;mdash; Onnesha Roychoudhuri: Books After Amazon">Onnesha Roychoudhuri publishes this view of Amazon&#8217;s marketing practices</a> in the latest issue of the <a href="http://www.bostonreview.net/" title="Boston Review &amp;mdash; Home">Boston Review</a>.  From the publisher&#8217;s perspective, the strong-arm tactics described sound horrible.  But the story also points to cracks appearing &#8212; at least for the bigger publishers.  That may leave smaller, independent publishers in a big squeeze.  
[Via OCLC Research's <a href="http://www.oclc.org/research/publications/newsletters/abovethefold/2010-12-17.htm" title="http://www.oclc.org/research/publications/newsletters/abovethefold/2010-12-17.htm">Above-the-Fold</a>]</p><p><h2><a name="academic_spam">Academic Search Engine Spam and Google Scholar&#8217;s Resilience Against it</a></h2></p><blockquote><p>Abstract: In a previous paper we provided guidelines for scholars on optimizing research articles for academic search engines such as Google Scholar. Feedback in the academic community to these guidelines was diverse. Some were concerned researchers could use our guidelines to manipulate rankings of scientific articles and promote what we call ‘academic search engine spam’. To find out whether these concerns are justified, we conducted several tests on Google Scholar. The results show that academic search engine spam is indeed—and with little effort—possible: We increased rankings of academic articles on Google Scholar by manipulating their citation counts; Google Scholar indexed invisible text we added to some articles, making papers appear for keyword searches the articles were not relevant for; Google Scholar indexed some nonsensical articles we randomly created with the paper generator SciGen; and Google Scholar linked to manipulated versions of research papers that contained a Viagra advertisement. At the end of this paper, we discuss whether academic search engine spam could become a serious threat to Web-based academic search engines.</p></blockquote><p><a href="http://quod.lib.umich.edu/cgi/t/text/text-idx?c=jep;view=text;rgn=main;idno=3336451.0013.305" title="Academic Search Engine Spam and Google Scholar's Resilience Against it">Joeran Beel and Bela Gipp have this article</a> in the most recent issue of <a href="http://www.journalofelectronicpublishing.org/" title="The Journal of Electronic Publishing: Welcome">Journal of Electronic Publishing</a>.  
In addition to being able to game <a href="http://scholar.google.com/" title="Google Scholar">Google Scholar</a>, the authors note that <a href="http://academic.research.microsoft.com/" title="Microsoft Academic Search">Microsoft Academic Search</a> and <a href="http://citeseer.ist.psu.edu/" title="CiteSeerX">CiteSeer</a> (as well as their own academic search engine currently under development &#8212; <a href="http://SciPlore.org/" title="SciPlore: Exploring Science">SciPlore</a>) have the same issues.  Although it is possible, we don&#8217;t know if it is being done &#8212; or even if there would be any penalties in the academic community for doing so.</p><p><h2><a name="mechanical_turk_spam">Mechanical Turk: Now with 40.92% spam</a></h2></p><blockquote><p>At this point, Amazon Mechanical Turk has reached the mainstream. Pretty much everyone knows about the concept. Post small tasks online, pay people cents, and get thousands of micro-tasks completed. Unfortunately, this resulted in some unfortunate trends. Anyone who frequents just a little bit the market will notice the tremendous number of spammy HITs. (HIT = a task posted for completion in the market; stands for Human Intelligence Task). &#8220;Test if the ads in my website work&#8221;. &#8220;Create a Twitter account and follow me&#8221;. &#8220;Like my YouTube video&#8221;. &#8220;Download this app&#8221;. &#8220;Write a positive review on Yelp&#8221;. A seemingly endless amount of spam HITs come to the market, mainly with the purpose of spamming &#8220;social media&#8221; metrics. So, with Dahn Tamir and Priya Kanth (MS student at NYU), we decided to examine how big is the problem. How many spammers join the market? 
How many spam HITs are there?</p></blockquote><p>This post from Panos Ipeirotis, Associate Professor at the IOMS Department at Stern School of Business of New York University, describes a <a href="http://behind-the-enemy-lines.blogspot.com/2010/12/mechanical-turk-now-with-4092-spam.html" title="Mechanical Turk: Now with 40.92% spam. - A Computer Scientist in a Business School">review of activities</a> posted to <a href="https://www.mturk.com/mturk/welcome">Amazon&#8217;s Mechanical Turk</a> service.  Spam is everywhere, and it appears that the Mechanical Turk is reducing the friction between buyers and workers of spam activity. [Via Ron Murray]</p><p><h2><a name="multispectral_imaging">Cutting-Edge Imaging Helps Scholar Reveal 8th-Century Manuscript</a></h2></p><blockquote><p>With a manuscript like the St. Chad Gospels, multispectral imaging—a series of scans, each based on a single part of the color spectrum—allows his team to create images that have the equivalent of three-dimensional detail, down to revealing the thickness of brush strokes on letters and illustrations. Cockled pages can be virtually flattened out so that all their details can be studied. Studied color band by color band, the chemical composition of ink can be determined.</p></blockquote><p>This <a href="http://chronicle.com/article/Cutting-Edge-Imaging-Helps/125616/" title="Cutting-Edge Imaging Helps Scholar Reveal 8th-Century Manuscript - Research - The Chronicle of Higher Education">article</a> by Jennifer Howard at the Chronicle of Higher Education reviews the story of how 8th-century documents in England were digitized by scholars at the University of Kentucky.  It caught my eye because of the mention of multispectral imaging; this is something that the JPEG2000 file format can natively store.  Digitization at this level doesn&#8217;t just provide alternative, online access to documents &#8212; it actually adds new information to the process of researching those documents.  
[Note: the link is behind a publisher paywall. If you would like to see it, send me an e-mail and I'll forward you a short-term link from the Chronicle's website.]</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/thursday-threads-2010w52/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>Thursday Threads: Google Scholar Coverage, Effective Meetings, Librarians as Obstacles, Cable TV</title><link>http://dltj.org/article/thursday-threads-2010w47/</link> <comments>http://dltj.org/article/thursday-threads-2010w47/#comments</comments> <pubDate>Thu, 25 Nov 2010 16:43:03 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Thursday Threads]]></category> <category><![CDATA[cable tv]]></category> <category><![CDATA[Google Scholar]]></category> <category><![CDATA[index/abstract database]]></category> <category><![CDATA[librarian profession]]></category> <category><![CDATA[open educational resources]]></category><guid isPermaLink="false">http://dltj.org/?p=1869</guid> <description><![CDATA[Receive DLTJ Thursday Threads by E-mail!Delivered by FeedBurner No, I am not composing this edition of DLTJ Thursday Threads on the Thanksgiving holiday. This was written the day before and scheduled for posting on Thursday. 
With a significant run of &#8230; <a href="http://dltj.org/article/thursday-threads-2010w47/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=1869"></abbr><div id="feedburner-thursday-threads-email-w47" class="wp-caption alignright" style="width: 250px;;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><form style="border:1px solid #ccc;padding:3px;text-align:center;" action="http://feedburner.google.com/fb/a/mailverify" method="post" target="popupwindow" onsubmit="window.open('http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads', 'popupwindow', 'scrollbars=yes,width=550,height=520');return true"><p><a href="http://feedburner.google.com/fb/a/mailverify?uri=thursday-threads&#038;loc=en_US" title="FeedBurner Email Subscription">Receive <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> Thursday Threads by E-mail!</a></p><input type="text" style="width:140px" name="email" value="Your e-mail address" onFocus="if (this.defaultValue==this.value) this.value = ''"/><input type="hidden" value="thursday-threads" name="uri"/><input type="hidden" name="loc" value="en_US"/><input type="submit" value="Subscribe" /><p style="font-size: 80%">Delivered by <a href="http://feedburner.google.com" target="_blank" title="Google Feedburner Service">FeedBurner</a></p></form></div><p> No, I am not composing this edition of <i><acronym title="Disruptive Library Technology Jester">DLTJ</acronym></i> Thursday Threads on the Thanksgiving holiday.  This was written the day before and scheduled for posting on Thursday.  With a significant run of <a href="http://dltj.org/category/thursday-threads/">weekly Thursday Threads postings</a>, it seemed a shame to break the trend because of a holiday.  So if it is Thanksgiving Thursday (in the U.S.) 
and you are looking for something to read, how about an article questioning the need for index and abstract databases in light of Google Scholar?  Or tips for post-holiday effective meetings?  Or how librarians are viewed as obstacles to effective open educational resources?  Or simply be thankful that you are not in the cable TV operator business.<br /><span id="more-1869"></span><br /><h2><a name="google_scholar">Google Scholar&#8217;s Dramatic Coverage Improvement Five Years after Debut</a></h2></p><blockquote><p><i>Abstract:</i> &#8220;This article reports a 2010 empirical study using a 2005 study as a base to compare Google Scholar&#8217;s coverage of scholarly journals with commercial services. Through random samples of eight databases, the author finds that, as of 2010, Google Scholar covers 98 to 100 percent of scholarly journals from both publicly accessible Web contents and from subscription-based databases that Google Scholar partners with. In 2005 the coverage of the same databases ranged from 30 to 88 percent. The author explores de-duplication of search results by Google Scholar and discusses its impacts on searches and library resources. With the dramatic improvement of Google Scholar, the uniqueness and effectiveness of subscription-based abstracts and indexes have dramatically changed.&#8221;</p><p><i>Introduction:</i> &#8220;This empirical study found that more than five years after its debut in November 2004, Google Scholar is able to retrieve any scholarly journal article record from all the publicly accessible Web sites and from subscription-based databases it is allowed to crawl. From February to April 2010, four hundred randomly selected records of scholarly journal articles from eight databases were used in test- searching Google Scholar. Only two records were not retrieved by Google Scholar. The result was 100 percent retrieval for six databases and 98 percent for the other two databases. 
This is a dramatic improvement compared with some below 50 percent coverage found in 2005. With this kind of coverage improvement by Google Scholar, information professionals should reevaluate its value and values of subscription-based abstracts and indexes.&#8221;</p></blockquote><p>This <a href="http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6W63-511KB4Y-1&amp;_user=10&amp;_coverDate=12%2F31%2F2010&amp;_rdoc=1&amp;_fmt=high&amp;_orig=search&amp;_origin=search&amp;_sort=d&amp;_docanchor=&amp;view=c&amp;_acct=C000050221&amp;_version=1&amp;_urlVersion=0&amp;_userid=10&amp;md5=15edb41b063b49e7dc6fcd68b6c2562a&amp;searchtype=a" title="&#039;Google Scholar's Dramatic Coverage Improvement Five Years after Debut&#039; | Serials Review | ScienceDirect">article</a> by <a href="http://hilltop.bradley.edu/~chen/index.html" title="Xiaotian Chen's homepage">Xiaotian Chen</a> generated <a href="http://friendfeed.com/dltj/f796fded/google-scholar-dramatic-coverage-improvement" title="Google Scholar's Dramatic Coverage Improvement... - Peter Murray - FriendFeed">considerable discussion on FriendFeed</a>.  It certainly seems that, on an &#8220;Innovator&#8217;s Dilemma&#8221; scale, Google Scholar has gotten good enough for all but the most demanding users of advanced field searching (e.g. chemical compounds, databases with complex thesauri, etc.).  So, as the author points out, &#8220;libraries can seriously consider cancelling a large number of subscription-based abstracts and indexes since their unique contents and value are rapidly evaporating.&#8221;  The rise of unified index products like Serials Solutions&#8217; Summon, EBSCO&#8217;s Discovery Service and OCLC&#8217;s WorldCat Local also points to this trend (particularly since the first is based on bypassing index databases and going directly to article publishers for metadata).</p><p><h2><a name="effective_meetings">Effective Agile Meetings</a></h2></p><blockquote><p>Meetings are expensive. 
An all-day team meeting costs thousands of dollars, if we calculate the cost of all the people involved along with overheads. Hence, it is pragmatic to do a fair amount of preparation for the meeting to ensure that the Agile meetings are as effective as possible.</p></blockquote><p><a href="http://www.infoq.com/news/2010/11/effective-agile-meetings" title="InfoQ: Effective Agile Meetings">Here are suggestions</a> from a variety of sources for effective meetings (or, as one commenter says, the elimination of &#8220;meetings&#8221; in favor of &#8220;structured workshops&#8221; as a more effective use of a group&#8217;s time). This comes from the &#8220;Agile software development&#8221; camp, but I find the suggestions are universal.</p><p><h2><a name="librarians_as_obstacles">When Librarians are Obstacles</a></h2></p><blockquote><p>Heading into the Open Ed Conference and especially the Mozilla Drumbeat Festival, I expected to be one of only a handful of librarians participating. Librarians haven’t been terribly involved or engaged with the open education movement, but our values and missions align so well that I expected to be welcomed by the professors and the edupunks as a peer and fellow traveller. Well, I got the first part right – I met only a couple of librarians all week – but the second, not so much. Imagine my surprise when the other two speakers in the session on libraries and the future of [open educational resources] spent much of their time criticizing the ways in which librarians have engaged with open education, and lamenting the possibility of librarians being anything other than a liability.</p></blockquote><p>Molly Kleinman posts about <a href="http://mollykleinman.com/2010/11/16/when-librarians-are-obstacles/" title="When librarians are obstacles | Molly Kleinman">her experiences at an &#8220;open education&#8221; conference</a> with faculty pushing back against a profession viewed as irrelevant and increasingly obsolete.  Librarians as liability? 
That may be harsh, but we had best not ignore the opinions of a growing number of faculty who are more than willing to meet us halfway on the path towards &#8220;open&#8221; content.</p><p><div id="npr-graphic" class="wp-caption alignright" style="width: 472px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; float: right;"><a href="http://www.npr.org/blogs/money/2010/11/23/131548390/fewer-americans-pay-for-cable-tv" title="Fewer Americans Pay For Cable TV : Planet Money : NPR"><img alt="United States Map showing a predominance of cable TV systems that lost subscribers." src="http://cdn.dltj.org/wp-content/uploads/2010/11/tv-subscription_custom.jpg?t=1290546982&#038;s=3" title="Map of Cable TV Subscribers Added and Lost" width="462" height="307" /></a><p style=' padding: 0 4px 5px; margin: 0;'  class="wp-caption-text">&quot;U.S. media markets, sized to represent their number of video subscribers, and colored to represent whether they had more or fewer subscribers between the first and second quarters of 2010.&quot; Image from the Wall Street Journal via NPR</p></div><h2><a name="cable_tv">Fewer Americans Pay for Cable TV</a></h2></p><blockquote><p>The number of people who pay for TV is falling for the first time, according to research firm SNL Kagan. The total decline was small — 335,000 homes out of 100 million, a mere 0.35% between the first and third quarters this year, the WSJ notes. But the numbers by region are pretty interesting. As this map from the WSJ shows, the number of subscribers didn&#8217;t fall everywhere.</p></blockquote><p>This <a href="http://www.npr.org/blogs/money/2010/11/23/131548390/fewer-americans-pay-for-cable-tv" title="Fewer Americans Pay For Cable TV : Planet Money : NPR">article on NPR&#8217;s Planet Money blog</a> calls out a <a href="http://online.wsj.com/article/SB10001424052748703567304575628831283366798.html" title="Cities Cut Cable Cord - U.S. 
Media Markets Map - Infographic - WSJ.com">Wall Street Journal article</a> with this map of cable TV subscription trends.  The legend is probably too small to read in this reduced-size version, but the red and green circles are sized in proportion to the cable TV systems they represent and color-coded for the number of subscribers lost and added by cable companies.  There is a lot of red on that map.  Paired with articles from the New York Times on how <a href="http://www.nytimes.com/2010/11/19/business/media/19warner.html" title="Time Warner Cable to Test Cheaper TV Package | New York Times">Time Warner Cable is testing cheaper cable TV packages</a> in New York City (and <a href="http://gothamist.com/2010/11/19/time_warner_cable_tries_to_appeal_t.php" title="Time Warner Cable Tries To Appeal To &#039;Cord Cutters&#039; | Gothamist">reportedly in Ohio</a>) and the <a href="http://www.nytimes.com/2010/11/23/technology/23netflix.html" title="Netflix Introduces Online-Only Subscription | New York Times">new Netflix streaming-only plan</a>, we can see that the content delivery landscape is changing.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/thursday-threads-2010w47/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Analysis of Google Scholar and Google Books</title><link>http://dltj.org/article/google-scholar-and-books/</link> <comments>http://dltj.org/article/google-scholar-and-books/#comments</comments> <pubDate>Wed, 15 Aug 2007 20:34:29 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[Directory of Open Access Journals]]></category> <category><![CDATA[ejournal]]></category> <category><![CDATA[Google]]></category> <category><![CDATA[Google Book Search]]></category> <category><![CDATA[Google Scholar]]></category> <category><![CDATA[publishing]]></category><guid isPermaLink="false">http://dltj.org/2007/08/google-scholar-and-books/</guid> 
<description><![CDATA[Two papers were published recently exploring the quality of Google Scholar and Google Books. Google Scholar: Philipp Mayr and Anne-Kathrin Walter, both of GESIS / Social Science Information Center in Bonn, Germany, uploaded an article to arXiv called &#8220;An exploratory study of &#8230; <a href="http://dltj.org/article/google-scholar-and-books/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/08/google-scholar-and-books/"></abbr><p>Two papers were published recently exploring the quality of <a href="http://scholar.google.com/" title="Google Scholar homepage">Google Scholar</a> and <a href="http://books.google.com/" title="Google Book Search homepage">Google Books</a>.</p><p><br clear="all" /><h2>Google Scholar</h2><br />Philipp Mayr and Anne-Kathrin Walter, both of GESIS / Social Science Information Center in Bonn, Germany, uploaded an article to arXiv called &#8220;<a href="http://arxiv.org/abs/0707.3575" title="arXiv abstract page for &#039;An exploratory study of Google Scholar&#039;">An exploratory study of Google Scholar</a>.&#8221; <sup><a href="http://dltj.org/article/google-scholar-and-books/#footnote_0_275" id="identifier_0_275" class="footnote-link footnote-identifier-link" title="Judging from the citation listed on Philipp Mayr&#8217;s homepage, the article will appear in an upcoming issue of Online Information Review from Emerald Group Publishing.">1</a></sup> Originally created as a presentation for a 2005 conference, it was updated in January 2007 to reflect new findings and published as a paper.  Excerpts from the abstract include:<br /><blockquote>The study shows deficiencies in the coverage and up-to-dateness of the [Google Scholar] index. Furthermore, the study points up which web servers are the most important data providers for this search service and which information sources are highly represented. 
We can show that there is a relatively large gap in Google Scholar’s coverage of German literature as well as weaknesses in the accessibility of Open Access content. Major commercial academic publishers are currently the main data providers.</p><p>We conclude that Google Scholar has some interesting pros (such as citation analysis and free materials) but the service can not be seen as a substitute for the use of special abstracting and indexing databases and library catalogues due to various weaknesses (such as transparency, coverage and up-to-dateness).</p></blockquote><p>The authors performed a &#8220;brute force analysis&#8221; (their words) of the coverage of Google Scholar by comparing search results by journal title with five journal lists:  ISI Arts &#038; Humanities Citation Index, ISI Social Science Citation Index, ISI Science Citation Index, open access journals listed by <abbr title="Directory of Open Access Journals">DOAJ</abbr>, and journals from the SOLIS database (mainly German-language journals from sociological disciplines).  They queried Google Scholar using the &#8220;Return articles published in&#8230;&#8221; limiter on the advanced search screen, downloaded the first 100 records for each title, then parsed and analyzed each of the records.  In total, 621,000 records from Google Scholar search results were analyzed.</p><p><img src="http://cdn.dltj.org/wp-content/uploads/2007/08/IdentificationOfJournals.png" alt="Number of Articles Found in Google Scholar by Title List" title="Number of Articles Found in Google Scholar by Title List" align="right" width="431" height="291" border="0" style="padding: 0 0 1.5em 2em;" />The authors first determined the coverage of titles in the five journal lists in the Google Scholar database.  The authors note surprise at the relative lack of coverage for open access titles listed in the DOAJ.  
I think this can be explained by the fact that many open access publishers are not using a systematic application to put their content on the internet.  Of the 2,804 journals in the DOAJ directory, only 846 are searchable via DOAJ&#8217;s own article-level indexing service.<sup><a href="http://dltj.org/article/google-scholar-and-books/#footnote_1_275" id="identifier_1_275" class="footnote-link footnote-identifier-link" title="Numbers from the DOAJ home page, as of 15-Aug-2007.">2</a></sup> If the journals can&#8217;t be easily harvested at the article level, then Google can&#8217;t add them to the Scholar article index.</p><p><br clear="all" /><img src="http://cdn.dltj.org/wp-content/uploads/2007/08/DistributionOfDocumentTypes.png" alt="Distribution of Document Types Among the Lists Queried" title="Distribution of Document Types Among the Lists Queried" align="right" width="430" height="291" border="0" style="padding: 0 0 1.5em 2em;" />Based on the semantics provided in each record, the authors divided the results into three categories (referred to in the paper as &#8220;document types&#8221;):  links to complete descriptive records on an external (publisher&#8217;s or aggregator&#8217;s) site, citation-only records (no full-text and no link to more complete information at an external site), and direct access links to full text.  The distribution of results is shown in the table to the right.</p><p>The paper also includes an analysis of the various publisher and portal sites that supply information to Google Scholar&#8217;s index.</p><p><h2>Google Books</h2><br />The August issue of First Monday contains an article by Paul Duguid called &#8220;<a href="http://www.firstmonday.org/issues/issue12_8/duguid/index.html" title="First Monday article: &#039;Inheritance and loss? A brief survey of Google Books&#039;">Inheritance and loss?  A brief survey of Google Books</a>&#8221;.  
The article is a somewhat contrived exploration of the Google Books Library Project through his lens of quality assurance derived &#8220;through innovation or through &#8216;inheritance.&#8217;&#8221;  His thesis seems to be that users expect the reputations of the libraries participating in the project (Harvard, University of Michigan, New York Public, Stanford, and Oxford, among the <a href="http://books.google.com/googlebooks/partners.html" title="Google Book Search Library Partners">other partners</a>, are arguably a reputable group) to convey a level of quality to the results of the digitization process in the Google Books Library Project.  Duguid then goes on to pick what arguably has to be the hardest book artifact to capture digitally (various editions of Laurence Sterne&#8217;s &#8220;<a href="http://andromeda.rutgers.edu/~jlynch/Biblio/shandy.html" title="Tristram Shandy: An Annotated Bibliography by Jack Lynch"><i>The Life and Opinions of Tristram Shandy, Gentleman</i></a>&#8221;) as an example of everything that is wrong with Google Books.</p><p>I don&#8217;t subscribe to that notion at all, but perhaps that is because I&#8217;ve been around enough technology and innovation to know that each new service needs to stand on its own. <i>Tristram Shandy</i> is in part an experiment in typography and layout by the author, as Duguid describes in detail in his article, and it is unusual and atypical in the extreme, so I think many of the characterizations of the Google Books project, based on this one artifact, are unfair and short-sighted.  
When you strip away the false dichotomy of innovative-or-inherited-quality, the oddities surrounding the <i>Tristram Shandy</i> artifact, and various unnecessary pot-shots<sup><a href="http://dltj.org/article/google-scholar-and-books/#footnote_2_275" id="identifier_2_275" class="footnote-link footnote-identifier-link" title="&#8220;A quick look at the online catalogue for Stanford&rsquo;s library shows that the Stanford volume presented as your second choice by Google Books is actually tucked away in the Stanford Auxiliary library along with &ldquo;infrequently&ndash;used&rdquo; texts.&#8221;">3</a></sup>, Duguid&#8217;s analysis does point to some apparent problems with Google&#8217;s scheme for digitizing and indexing books.  The quality of some of the scans, pointed out in the <i>Tristram Shandy</i> artifact and others, is a source of concern.  Substandard metadata is another:<br /><blockquote>Not a word is mentioned about multiple volumes or volume number. Indeed, a quick survey of the Google Book Project suggests that Google doesn’t recognize volume numbers. Not only are the different editions (Harvard’s from 1896, Stanford’s from 1904) given exactly the same name, but also the different volumes of this Stanford’s multivolume edition are labeled identically. 
Consequently, whatever algorithm Google uses to find the book, it is quite likely, as in this case, to offer volume II first.</p></blockquote><p>Reservations aside, it is a good review of some of the problematic outcomes of the Google Books Library Project.</p><h2>Footnotes</h2><ol class="footnotes"><li id="footnote_0_275" class="footnote">Judging from the citation listed on <a href="http://www.gesis.org/IZ/Mayr/" title="" class="broken_link" rel="nofollow">Philipp Mayr&#8217;s homepage</a>, the article will appear in an upcoming issue of Online Information Review from Emerald Group Publishing.</li><li id="footnote_1_275" class="footnote">Numbers from the <a href="http://www.doaj.org/" title="Directory of Open Access Journals homepage">DOAJ home page</a>, as of 15-Aug-2007.</li><li id="footnote_2_275" class="footnote">&#8220;A quick look at the online catalogue for Stanford’s library shows that the Stanford volume presented as your second choice by Google Books is actually tucked away in the Stanford Auxiliary library along with “infrequently–used” texts.&#8221;</li></ol>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/google-scholar-and-books/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> </channel> </rss>
