<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"><channel><title>Disruptive Library Technology Jester &#187; oai-pmh</title> <atom:link href="http://dltj.org/tag/oai-pmh/feed/" rel="self" type="application/rss+xml" /><link>http://dltj.org</link> <description>We&#039;re Disrupted, We&#039;re Librarians, and We&#039;re Not Going to Take It Anymore</description> <lastBuildDate>Mon, 06 Feb 2012 20:04:22 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <cloud domain='dltj.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' /> <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/3.0/us/</creativeCommons:license> <item><title>Thumbgrabber: a metadata augmentation tool</title><link>http://dltj.org/article/thumbgrabber-from-uiuc/</link> <comments>http://dltj.org/article/thumbgrabber-from-uiuc/#comments</comments> <pubDate>Tue, 29 Apr 2008 20:21:00 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[description]]></category> <category><![CDATA[imaging]]></category> <category><![CDATA[metadata]]></category> <category><![CDATA[oai-pmh]]></category> <category><![CDATA[paper]]></category> <category><![CDATA[UIUC]]></category><guid isPermaLink="false">https://dltj.org/?p=353</guid> <description><![CDATA[In reading a background paper for the American Social History Online portal, I was reacquainted with a paper by Muriel Foulonneau, Thomas Habing and Tim Cole from UIUC called &#8220;Automated Capture of 
Thumbnails and Thumbshots for Use by Metadata Aggregation &#8230; <a href="http://dltj.org/article/thumbgrabber-from-uiuc/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="https://dltj.org/?p=353"></abbr><p><span style="float: right; padding: 5px;"><a href="http://www.researchblogging.org" title="Research Blogging"><img alt="Blogging on Peer Review Research" src="http://cdn.dltj.org/wp-content/uploads/2008/04/ResearchBlogging-Medium-Trans.png" width="80" height="50" /></a></span>In reading a background paper for the American Social History Online portal, I was reacquainted with a paper by Muriel Foulonneau, Thomas Habing and Tim Cole from UIUC called &#8220;Automated Capture of Thumbnails and Thumbshots for Use by Metadata Aggregation Services.&#8221;<sup><a href="http://dltj.org/article/thumbgrabber-from-uiuc/#footnote_0_353" id="identifier_0_353" class="footnote-link footnote-identifier-link" title="Foulonneau, M., Habing, T.G., Cole, T.W. (2006). Automated Capture of Thumbnails and Thumbshots for Use by Metadata Aggregation Services. D-Lib Magazine, 12(1) DOI: 10.1045/january2006-foulonneau">1</a></sup> This is the abstract:<br /><blockquote>The practice of including thumbnails in short record displays, increasingly common in local implementations, is being adopted by metadata aggregation service providers as well. In addition, thumbnails and Web thumbshots have begun appearing as part of Web search results. This article reports on a project at the University of Illinois at Urbana-Champaign (UIUC) to make more comprehensible heterogeneous resources available on the UIUC CIC metadata portal by incorporating thumbnails and thumbshots of image and Webpage resources in the context of the OAI Protocol for Metadata Harvesting. 
In addition to thumbnails provided by partner data providers, UIUC has developed an automated process to generate thumbnails and thumbshots from the Webpages resources pointed to by the metadata records.</p></blockquote><p>The paper cites dissatisfaction with results from metadata portals that consist exclusively of textual descriptions of the objects.  It also cites studies that show the addition of thumbnail images to the results display improves user satisfaction.  With that in mind, UIUC wrote <span class="removed_link" title="http://cicharvest.grainger.uiuc.edu/thumb.asp">Thumbgrabber</span> &#8212; a Windows application written in Visual Basic that uses Internet Explorer to find images in websites and/or take image snapshots of web pages as they have been rendered.  In the UIUC context, the application is fed URLs from records harvested via OAI-PMH, although it would seem like it would be able to process any arbitrary list of URLs.</p><p>This is a useful tool to keep in mind as we think more about aggregating the metadata records into vertical (subject-specific) portals and repurpose metadata records in other ways.<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://cicharvest.grainger.uiuc.edu/thumb.asp on January 28th, 2011.</p><h2>Footnotes</h2><ol class="footnotes"><li id="footnote_0_353" class="footnote"><span class="Z3988" title="ctx_ver=Z39.88-2004&#038;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&#038;rft.aulast=Foulonneau&#038;rft.aufirst=Muriel&#038;rft.au=Muriel+ Foulonneau&#038;rft.au=Thomas+Habing&#038;rft.au=Timothy+Cole&#038;rft.title=D-Lib+Magazine&#038;rft.atitle=Automated+Capture+of+Thumbnails+and+Thumbshots+for+Use+by+Metadata+Aggregation+Services&#038;rft.date=2006&#038;rft.volume=12&#038;rft.issue=1&#038;rft.spage=&#038;rft.genre=article&#038;rft.id=info:DOI/10.1045%2Fjanuary2006-foulonneau"></span>Foulonneau, M., Habing, T.G., Cole, T.W. (2006). 
Automated Capture of Thumbnails and Thumbshots for Use by Metadata Aggregation Services. <span style="font-style: italic;">D-Lib Magazine, 12</span>(1) DOI: <a rev="review" href="http://dx.doi.org/10.1045/january2006-foulonneau" title="Handle Redirect">10.1045/january2006-foulonneau</a></li></ol>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/thumbgrabber-from-uiuc/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Article-Level OAI-PMH Harvest Available from DOAJ</title><link>http://dltj.org/article/doaj-articles/</link> <comments>http://dltj.org/article/doaj-articles/#comments</comments> <pubDate>Wed, 11 Jul 2007 20:51:54 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[description]]></category> <category><![CDATA[Directory of Open Access Journals]]></category> <category><![CDATA[ejournal]]></category> <category><![CDATA[oai-pmh]]></category> <category><![CDATA[open access]]></category><guid isPermaLink="false">http://dltj.org/2007/07/doaj-articles/</guid> <description><![CDATA[Earlier this year the DOAJ began offering a new schema for registered articles that significantly improves the value of OAI-PMH harvested article content. Prior to this addition the only scheme available was Dublin Core, which as a metadata schema for &#8230; <a href="http://dltj.org/article/doaj-articles/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/07/doaj-articles/"></abbr><p>Earlier this year the <a href="http://www.doaj.org/doaj?func=loadTempl&#038;templ=070509" title="Article-level OAI-PMH announcement on DOAJ website"><abbr title="Directory of Open Access Journals">DOAJ</abbr> began offering a new schema for registered articles</a> that significantly improves the value of OAI-PMH harvested article content.  
Prior to this addition the only scheme available was Dublin Core, which as a metadata schema for describing article content is woefully inadequate.  (Dublin Core, of course, was never designed to handle the complexity of the description of an average article.)  The <a href="http://www.doaj.org/schemas/doajArticles.xsd" title="doajArticles&#039; XML schema">new schema</a> (graphically represented here<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/07/doajArticles_schema_image1.png" rel="lightbox"><img src="http://cdn.dltj.org/wp-content/uploads/2007/07/.doajArticles_schema_image.png" alt="doajArticles schema image" title="doajArticles schema image" align="right" width="112" height="146" border="0" /></a> &#8212; select thumbnail to see a larger version) includes elements for ISSN/eISSN, volume/issue, start/end page numbers, and author affiliation.  There is also a <code>&lt;fullTextUrl&gt;</code> element that is a link to the article content itself (not the splash page of the article on the publisher&#8217;s site).</p><p>Article content using this schema is harvestable through the DOAJ OAI-PMH provider site (for instance, using a <a href="http://www.doaj.org/oai.article?verb=ListRecords&#038;metadataPrefix=doajArticle" title="XML harvest of the latest articles added to the DOAJ article archive"><code>ListRecords</code> verb with a <code>doajArticle</code> metadata prefix</a> against the PMH URL).  This is, in fact, the same schema journal publishers use to submit article content to the DOAJ article database.  
With these pieces in place, it is now conceivable to harvest open access journal article content through the DOAJ and add it to a local journal article repository (such as the <a href="http://journals.ohiolink.edu/ejc/article.cgi?issn=14649055&#038;issue=v25i0002&#038;article=191_etoe" title="Journals: the OhioLINK experience&#039; article record in OhioLINK EJC">Electronic Journal Center</a> in the case of OhioLINK).</p><p>Thanks go out to the DOAJ folks for making this available!</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/doaj-articles/feed/</wfw:commentRss> <slash:comments>7</slash:comments> </item> <item><title>A Report on Namespaces Used by OAI-PMH Repositories</title><link>http://dltj.org/article/oai-pmh-namespaces/</link> <comments>http://dltj.org/article/oai-pmh-namespaces/#comments</comments> <pubDate>Tue, 20 Mar 2007 21:00:43 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Linking Technologies]]></category> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[digital libraries]]></category> <category><![CDATA[Dublin Core]]></category> <category><![CDATA[libraries]]></category> <category><![CDATA[MARC]]></category> <category><![CDATA[metadata]]></category> <category><![CDATA[oai-pmh]]></category> <category><![CDATA[standards]]></category><guid isPermaLink="false">http://dltj.org/2007/03/oai-pmh-namespaces/</guid> <description><![CDATA[I had a need for a survey of the metadata namespaces used by OAI-PMH repositories, so I wrote up a quick shell script and XSLT style sheet to parse through the list of Registered Data Providers at the OpenArchives.org website. 
&#8230; <a href="http://dltj.org/article/oai-pmh-namespaces/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/03/oai-pmh-namespaces/"></abbr><p>I had a need for a survey of the metadata namespaces used by <a href="http://www.openarchives.org/pmh/" title="Open Archives Initiative Protocol for Metadata Harvesting homepage">OAI-PMH</a> repositories, so I wrote up a quick shell script and XSLT style sheet to parse through the list of <a href="http://www.openarchives.org/Register/BrowseSites" title="Registered OAI-PMH Data Providers">Registered Data Providers</a> at the OpenArchives.org website.  The <span class="removed_link" title="http://dltj.org/misc/oai-pmh-namespace-report.html">results of this effort</span> are pretty interesting.  Some of them:</p><ul><li>Dublin Core is, as you would expect, the highest-used descriptive metadata standard.  Every service &mdash; or at least those that reported using any namespace at all &mdash; reported Dublin Core as a record harvesting option.  For some, it was the <em>only</em> option (which I find rather sad).  One problem, though, comes in with the variety of namespace URIs declared that all appear to be semantically the same thing: <tt>http://www.openarchives.org/OAI/2.0/oai_dc/</tt>, <tt>http://www.openarchives.org/OAI/2.0/oai_dc</tt> (note the missing trailing slash), <tt>http://purl.org/dc/elements/2.0/</tt> (used exclusively by the <span class="removed_link" title="http://www.umi.com/umi/digitalcommons/">ProQuest Digital Commons product</span>, it would seem), and <tt>http://purl.org/dc/elements/1.1/</tt> (the difference between 2.0 and 1.1 is not clear to me).  
In order to be processable, there must be an exact string match of the namespace URI &#8212; so even that missing trailing slash is significant!</li><li>The next most popular namespace URI is <tt>http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1807.txt</tt>, which semantically would seem to identify the <a href="http://www.faqs.org/rfcs/rfc1807.html" title="RFC 1807 (rfc1807) - A Format for Bibliographic Records">IETF RFC 1807 on a Format for Bibliographic Records</a>.  You can <a href="http://etd.library.pitt.edu/ETD-db/NDLTD-OAI2/oai.pl?verb=GetRecord&#038;metadataPrefix=oai_rfc1807&#038;identifier=oai%3APITETD%3Aetd-11272006-155805" title="">see what one of these things looks like</a> &#8212; although RFC1807 predates XML (it was approved by the IETF in mid-1995), it looks like someone turned the metadata format into XML along the way.  Very interesting&#8230;</li><li>The next most popular is <tt>http://www.ndltd.org/standards/metadata/etdms/1.0/</tt> &mdash; corresponding to the <a href="http://www.ndltd.org/standards/metadata/etd-ms-v1.00-rev2.html/" title="Networked Digital Library of Theses and Dissertations Metadata Standard">Interoperability Metadata Standard for Electronic Theses and Dissertations</a> &mdash; followed closely by <tt>http://www.openarchives.org/OAI/1.1/oai_marc</tt> &mdash; which <a href="http://www.openarchives.org/OAI/2.0/guidelines-marcxml.htm" title="OAI-PMH Implementation Guidelines: A recommended XML Schema to represent MARC21 records">fell out of favor years ago</a> with the publication of <a href="http://www.loc.gov/standards/marcxml/" title="MARC 21 XML Schema">MARC21</a> by the Library of Congress (which goes by the namespace <tt>http://www.loc.gov/MARC21/slim</tt>).  
Unfortunately, it doesn&#8217;t seem to have been picked up by the majority of OAI-PMH data providers that used the older oai_marc schema.</li><li>As you get towards the bottom of the first list, there are all sorts of interesting variants on qualified Dublin Core and other one-off schemas.</li></ul><p>Your thoughts and observations?  I&#8217;ve filed away the UNIX script and XSLT style sheet.  If there is interest in seeing something like this in the future, let me know and I can dig them out.<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://dltj.org/misc/oai-pmh-namespace-report.html on December 30th, 2010.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://www.umi.com/umi/digitalcommons/ on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://www.ndltd.org/standards/metadata/current.html to http://www.ndltd.org/standards/metadata/etd-ms-v1.00-rev2.html/ on January 19th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/oai-pmh-namespaces/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> </channel> </rss>
