<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"	> <channel><title>Comments on: Analysis of CDL&#8217;s XTF textIndexer to Replace the Local Files with FEDORA Objects</title> <atom:link href="http://dltj.org/article/xtf-fedora-2/feed/" rel="self" type="application/rss+xml" /><link>http://dltj.org/article/xtf-fedora-2/</link> <description>We&#039;re Disrupted, We&#039;re Librarians, and We&#039;re Not Going to Take It Anymore</description> <lastBuildDate>Wed, 08 Feb 2012 17:48:39 +0000</lastBuildDate> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <item><title>By: SourceForge.net: xtf-user</title><link>http://dltj.org/article/xtf-fedora-2/comment-page-1/#comment-3900</link> <dc:creator>SourceForge.net: xtf-user</dc:creator> <pubDate>Wed, 13 Sep 2006 18:48:31 +0000</pubDate> <guid isPermaLink="false">http://dltj.org/2006/08/xtf-fedora-2/#comment-3900</guid> <description>&lt;!--%kramer-ref-pre%--&gt;[...] To the FEDORA and XTF communities -- At OhioLINK, we&#039;re aggressively pursuing the California Digital Library&#039;s XTF software as a front end for our digital collections in a FEDORA content repository. I&#039;ve written up some observations about what XTF integrated with FEDORA might look like and would welcome your comments and observations. We&#039;d particularly like to know if anyone else is pursuing a similar path. The URLs are: http://dltj.org/2006/08/xtf-fedora-1/ http://dltj.org/2006/08/xtf-fedora-2/ Public comments (in the form of responses on the blog) or private ones (e-mail replies) would be most appreciated. 
Martin Haye, one of the lead developers of XTF, has been kind enough to offer some replies already and so far this seems like a viable solution. Peter &#160; [...]&lt;!--%kramer-ref-post%--&gt;</description> <content:encoded><![CDATA[<p><img src="http://cdn.dltj.org/wp-content/plugins/kramer/kramer.gif" class="technorati-balloon" alt="Kramer auto Pingback" style="border:0;" />[...] To the FEDORA and XTF communities &#8212; At OhioLINK, we&#8217;re aggressively pursuing the California Digital Library&#8217;s XTF software as a front end for our digital collections in a FEDORA content repository. I&#8217;ve written up some observations about what XTF integrated with FEDORA might look like and would welcome your comments and observations. We&#8217;d particularly like to know if anyone else is pursuing a similar path. The URLs are: <a href="http://dltj.org/2006/08/xtf-fedora-1/" rel="nofollow">http://dltj.org/2006/08/xtf-fedora-1/</a> <a href="http://dltj.org/2006/08/xtf-fedora-2/" rel="nofollow">http://dltj.org/2006/08/xtf-fedora-2/</a> Public comments (in the form of responses on the blog) or private ones (e-mail replies) would be most appreciated. Martin Haye, one of the lead developers of XTF, has been kind enough to offer some replies already and so far this seems like a viable solution. Peter &nbsp; [...]</p> ]]></content:encoded> </item> <item><title>By: the jester</title><link>http://dltj.org/article/xtf-fedora-2/comment-page-1/#comment-3015</link> <dc:creator>the jester</dc:creator> <pubDate>Wed, 23 Aug 2006 13:57:43 +0000</pubDate> <guid isPermaLink="false">http://dltj.org/2006/08/xtf-fedora-2/#comment-3015</guid> <description>[quote comment=&quot;2994&quot;]One thought on lazy files: if you have some mechanism for random byte-level access to a data stream in the Fedora object, you could supply that to XTF through the StructuredStore interface. 
The interface was designed with this in mind, though we haven&#039;t ever used it for that.[/quote] Ah, that is a bit of a problem -- at present, access to objects in FEDORA requires a Web Services call and the calls to retrieve data (through a &quot;disseminator&quot; or directly a datastream) do not support a byte-range request.  (There has been &lt;a href=&quot;http://article.gmane.org/gmane.comp.cms.fedora-commons.user/1728/&quot; rel=&quot;nofollow&quot;&gt;some talk&lt;/a&gt; about this, so it might get done some day.) [quote comment=&quot;2994&quot;]What you call a &quot;disseminator&quot; is similar to subclasses of IndexSource, such as XMLIndexSource. There is also a PDFIndexSource and HTMLIndexSource that you might be interested in. If you end up creating others for Word or other formats, perhaps you&#039;d consider contributing them back into XTF.[/quote] Exactly -- the point of distinction is where the transformation to XML takes place...as a subclass of IndexSource in XTF or as a function of the repository (the disseminator).  In either case, the transformation is happening in a piece of Java code.  My inclination, probably borne out by just raw familiarity at the moment, is to do the transformation as a FEDORA disseminator, but the code would certainly be used to write a new IndexSource class.  I&#039;ll let you know if we come up with anything new that might be useful to you. Thank you for your comments!  They have been most helpful.&lt;p style=&quot;padding:0;margin:0;font-style:italic;&quot;&gt;The text was modified to update a link from http://comm.nsdl.org/pipermail/fedora-users/2006-May/001723.html to http://article.gmane.org/gmane.comp.cms.fedora-commons.user/1728/ on February 12th, 2011.&lt;/p&gt;</description> <content:encoded><![CDATA[<p>[quote comment="2994"]One thought on lazy files: if you have some mechanism for random byte-level access to a data stream in the Fedora object, you could supply that to XTF through the StructuredStore interface. 
The interface was designed with this in mind, though we haven&#8217;t ever used it for that.[/quote]</p><p>Ah, that is a bit of a problem &#8212; at present, access to objects in FEDORA requires a Web Services call and the calls to retrieve data (through a &#8220;disseminator&#8221; or directly a datastream) do not support a byte-range request.  (There has been <a href="http://article.gmane.org/gmane.comp.cms.fedora-commons.user/1728/" rel="nofollow">some talk</a> about this, so it might get done some day.)</p><p>[quote comment="2994"]What you call a &#8220;disseminator&#8221; is similar to subclasses of IndexSource, such as XMLIndexSource. There is also a PDFIndexSource and HTMLIndexSource that you might be interested in. If you end up creating others for Word or other formats, perhaps you&#8217;d consider contributing them back into XTF.[/quote]</p><p>Exactly &#8212; the point of distinction is where the transformation to XML takes place&#8230;as a subclass of IndexSource in XTF or as a function of the repository (the disseminator).  In either case, the transformation is happening in a piece of Java code.  My inclination, probably borne out by just raw familiarity at the moment, is to do the transformation as a FEDORA disseminator, but the code would certainly be used to write a new IndexSource class.  I&#8217;ll let you know if we come up with anything new that might be useful to you.</p><p>Thank you for your comments!  
They have been most helpful.<p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from <a href="http://comm.nsdl.org/pipermail/fedora-users/2006-May/001723.html" rel="nofollow">http://comm.nsdl.org/pipermail/fedora-users/2006-May/001723.html</a> to <a href="http://article.gmane.org/gmane.comp.cms.fedora-commons.user/1728/" rel="nofollow">http://article.gmane.org/gmane.comp.cms.fedora-commons.user/1728/</a> on February 12th, 2011.</p> ]]></content:encoded> </item> <item><title>By: Martin Haye</title><link>http://dltj.org/article/xtf-fedora-2/comment-page-1/#comment-2994</link> <dc:creator>Martin Haye</dc:creator> <pubDate>Tue, 22 Aug 2006 23:04:49 +0000</pubDate> <guid isPermaLink="false">http://dltj.org/2006/08/xtf-fedora-2/#comment-2994</guid> <description>I agree with your strategy. However you get a set of documents from Fedora (I&#039;m not that familiar with it), you basically want to replicate the functionality of SrcTreeProcessor, which does the work of wrapping the input sources and passing them to XMLTextProcessor for the heavy lifting. One thought on lazy files: if you have some mechanism for random byte-level access to a data stream in the Fedora object, you could supply that to XTF through the StructuredStore interface. The interface was designed with this in mind, though we haven&#039;t ever used it for that. What you call a &quot;disseminator&quot; is similar to subclasses of IndexSource, such as XMLIndexSource. There is also a PDFIndexSource and HTMLIndexSource that you might be interested in. If you end up creating others for Word or other formats, perhaps you&#039;d consider contributing them back into XTF.</description> <content:encoded><![CDATA[<p>I agree with your strategy. 
However you get a set of documents from Fedora (I&#8217;m not that familiar with it), you basically want to replicate the functionality of SrcTreeProcessor, which does the work of wrapping the input sources and passing them to XMLTextProcessor for the heavy lifting.</p><p>One thought on lazy files: if you have some mechanism for random byte-level access to a data stream in the Fedora object, you could supply that to XTF through the StructuredStore interface. The interface was designed with this in mind, though we haven&#8217;t ever used it for that.</p><p>What you call a &#8220;disseminator&#8221; is similar to subclasses of IndexSource, such as XMLIndexSource. There is also a PDFIndexSource and HTMLIndexSource that you might be interested in. If you end up creating others for Word or other formats, perhaps you&#8217;d consider contributing them back into XTF.</p> ]]></content:encoded> </item> <item><title>By: Martin Haye</title><link>http://dltj.org/article/xtf-fedora-2/comment-page-1/#comment-2993</link> <dc:creator>Martin Haye</dc:creator> <pubDate>Tue, 22 Aug 2006 22:52:49 +0000</pubDate> <guid isPermaLink="false">http://dltj.org/2006/08/xtf-fedora-2/#comment-2993</guid> <description>One clarification before I get to general comments: processFile() forms the path for the lazy tree, but the file isn&#039;t actually created until the queued documents are processed.</description> <content:encoded><![CDATA[<p>One clarification before I get to general comments: processFile() forms the path for the lazy tree, but the file isn&#8217;t actually created until the queued documents are processed.</p> ]]></content:encoded> </item> </channel> </rss>
