<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"><channel><title>Disruptive Library Technology Jester &#187; DRC</title> <atom:link href="http://dltj.org/tag/drc/feed/" rel="self" type="application/rss+xml" /><link>http://dltj.org</link> <description>We&#039;re Disrupted, We&#039;re Librarians, and We&#039;re Not Going to Take It Anymore</description> <lastBuildDate>Mon, 06 Feb 2012 20:04:22 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <cloud domain='dltj.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' /> <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/3.0/us/</creativeCommons:license> <item><title>Position Announcement: OhioLINK Systems Developer</title><link>http://dltj.org/article/ohiolink-systems-developer-sought/</link> <comments>http://dltj.org/article/ohiolink-systems-developer-sought/#comments</comments> <pubDate>Fri, 05 Sep 2008 17:59:25 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[OhioLINK]]></category> <category><![CDATA[DRC]]></category><guid isPermaLink="false">http://dltj.org/?p=467</guid> <description><![CDATA[The Ohio Library and Information Network (OhioLINK) is seeking a hard-working, analytical individual to participate in the creation and maintenance of our internationally recognized set of online library information services, with special focus on the Ohio Digital Resource Commons. 
OhioLINK &#8230; <a href="http://dltj.org/article/ohiolink-systems-developer-sought/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=467"></abbr><p>The Ohio Library and Information Network (<a href="http://www.ohiolink.edu/" title="OhioLINK &amp;ndash; The Ohio Library and Information Network">OhioLINK</a>) is seeking a hard-working, analytical individual to participate in the creation and maintenance of our internationally recognized set of online library information services, with special focus on the <a href="http://drc.ohiolink.edu/" title="DRC Home">Ohio Digital Resource Commons</a>. OhioLINK serves the higher education population in the State of Ohio with over <a href="http://www.ohiolink.edu/members-info/" title="File Not Found">85 college and university member institutions</a>.</p><p>The position requires a four-year degree in Computer Science, or a graduate degree in Information or Library Science, or equivalent technical experience. The candidate should have strong programming skills in languages such as Java, and should be comfortable working in a Unix/Linux environment with open source software. Experience with the following is highly valued: Digital Repositories, Cocoon, Apache Tomcat, XML/XSLT, PostgreSQL. 
Experience with the following is desirable: DSpace/Manakin, HTML/CSS site design, metadata, Subversion, Perl, shell scripting.</p><p>Salary:  $49,000 minimum</p><p>If you are interested in this position, please send a resume, a summary statement of experience, and an indication of your salary expectations to <a href="mailto:resume@ohiolink.edu">resume@ohiolink.edu</a>.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://www.ohiolink.edu/member-info/ to http://www.ohiolink.edu/members-info/ on January 13th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/ohiolink-systems-developer-sought/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>Two Personal Repository Services</title><link>http://dltj.org/article/two-personal-repository-services/</link> <comments>http://dltj.org/article/two-personal-repository-services/#comments</comments> <pubDate>Mon, 04 Jun 2007 03:03:23 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Unified Content Repository]]></category> <category><![CDATA[eprints]]></category> <category><![CDATA[jisc]]></category> <category><![CDATA[preservation]]></category><guid isPermaLink="false">http://dltj.org/2007/06/two-personal-repository-services/</guid> <description><![CDATA[This year has seen the release of two personal repository services: http://PublicationsList.org/ and the U.K. Depot. 
These two services have an admittedly different focus, but I think it is still interesting to compare and contrast them to see what we &#8230; <a href="http://dltj.org/article/two-personal-repository-services/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/06/two-personal-repository-services/"></abbr><p>This year has seen the release of two personal repository services: <a href="http://publicationslist.org/" title="Homepage: PublicationsList.org">http://PublicationsList.org/</a> and the <a href="http://depot.edina.ac.uk/" title="Homepage: The Depot">U.K. Depot</a>.  These two services have an admittedly different focus, but I think it is still interesting to compare and contrast them to see what we can learn.<br /><span id="more-244"></span><br /><h2>The Depot</h2><br /><i>The Depot</i> provides a one-stop place for U.K.-based researchers to deposit refereed articles, book chapters, and conference papers.  It is &#8220;one-stop&#8221; in that The Depot can forward the author to his/her institution-based repository <em>or</em>, in the case where the author&#8217;s institution does not have a repository, upload and host the content right from The Depot.</p><p>The deposit interface, for those putting content directly into the centralized Depot repository, has four main stages.  
First, the &#8220;Type&#8221; stage, specifying whether the object is an article, a book chapter, or a conference paper: <img src="http://cdn.dltj.org/wp-content/uploads/2007/06/depot-01-type.png" alt="The Depot - “Type” Screen" /></p><p>Next, the &#8220;Upload&#8221; stage, where one can upload the file and supply a few more properties: <img src="http://cdn.dltj.org/wp-content/uploads/2007/06/depot-02-upload.png" alt="The Depot - “Upload” Screen" /></p><p>Then the &#8220;Details&#8221; stage, where the descriptive metadata (minus the controlled vocabulary subjects &#8212; that comes in the next screen) is input: <img src="http://cdn.dltj.org/wp-content/uploads/2007/06/depot-03-details.png" alt="The Depot - “Details” Screen" /></p><p>And finally, the &#8220;Subjects&#8221; page, with an AJAX-driven expanding-and-collapsing hierarchy of subjects:<img src="http://cdn.dltj.org/wp-content/uploads/2007/06/depot-04-subjects.png" alt="The Depot - “Subjects” Screen" /></p><p>To retrieve contents from the repository, there is a &#8220;<a href="http://deposit.depot.edina.ac.uk/view/" title="Browse Items - the Depot">browse</a>&#8221; interface for looking by &#8216;year&#8217; or by &#8216;subject&#8217; &#8212; no other browse facets and no search interface.  The Depot was just formally released this month, so I would bet that functionality like that is in the works.</p><p><h2>PublicationsList.org</h2><br /><a href="http://publicationslist.org/" title="Homepage: PublicationsList">PublicationsList</a> is a commercial service with a free, limited-functionality version.  Unlike The Depot (and similar institutional repository systems), the focus is on putting together and publishing a personal bibliography with the deposit function taking a secondary role (and only for paid subscribers of the service).</p><p>The single item entry page is a just-the-facts interface.  
Note that the content hosting service is only available to those who have upgraded to the &#8220;Publications List Professional&#8221; version (which <a href="http://publicationslist.org/faq.html" title="Publications List FAQ">costs</a> £9.99, or approx $20/€15, per year).<br /><img src="http://cdn.dltj.org/wp-content/uploads/2007/06/single-item-reference-entry.png" alt="PublicationList single item entry" /></p><p>The system can also accept a variety of citation manager file formats for bulk entry. (See snapshot to the right.) <img src="http://cdn.dltj.org/wp-content/uploads/2007/06/import-references.png" alt="PublicationsList Import" style="float: right;" /> PublicationsList also has a built-in <a href="http://publicationslist.org/pubmed.html" title="PubMed - keep your online publications list up to date with import from NLM / NIH PubMed / MEDLINE">search-and-select interface to PubMed</a> for finding publications matching your name and automatically populating the metadata fields in your personal citation.</p><p>The end result is a web-based bibliography with links to the publications (either hosted on PublicationsList or on other sites).  
The free version is hosted on PublicationsList.org (see the <a href="http://publicationslist.org/rcc" title="rcc - Publications List">service founder&#8217;s </a>page as an example) and the professional version can <a href="http://publicationslist.org/embed.html" title="Embedding a publications list in another web page">embed the publications list in your own page</a>.</p><p>PublicationsList does provide discounts and additional functionality for <a href="http://publicationslist.org/group.html" title="Register a group publications list">groups</a> (such as departments, research centers, etc.).<br /><br clear="all" /></p><p><h2>Observations</h2><br />Both The Depot and PublicationsList provide interesting suites of features for academics seeking to get their content online, but neither really addresses the problems of getting academics to put their content online. <sup><a href="http://dltj.org/article/two-personal-repository-services/#footnote_0_244" id="identifier_0_244" class="footnote-link footnote-identifier-link" title="For a really good discussion of that problem, see Davis, P.M., &amp;#038; Connolly, M.J.L. (2007). Institutional Repositories: Evaluating the Reasons for Non-use of Cornell University&amp;#8217;s Installation of DSpace. D-Lib Magazine, 13(3/4). Retrieved March 14, 2007, from http://www.dlib.org/dlib/march07/davis/03davis.html.">1</a></sup> The search-and-select interface for PubMed is very helpful in cutting down on the data entry required to populate a citation entry.  If OhioLINK were to replicate this service, we could tap into not only PubMed but also the wide variety of index/abstract databases and electronic journals that we host.  The automatic handling of various forms of citation management data is also nice.  
I don&#8217;t think PublicationsList offers an <em>export</em> feature, which would be good to have so that an author can add entries found through the search-and-select interface back into their personal bibliographic management software.</p><p>The one-stop, redirection service in The Depot is a good concept, too. <em>If</em> a researcher wanted to deposit their content in a repository and they weren&#8217;t sure if their institution had a repository to hold it, OhioLINK would be a natural place to look for a content hosting service in the state and we could redirect the author to the appropriate location on a campus.  OhioLINK could also be playing the role of repository-of-last-resort for Ohio academic researchers by providing a space and services for published content, whether or not the institution in question has set up a formal repository space on the DRC.</p><h2>Footnotes</h2><ol class="footnotes"><li id="footnote_0_244" class="footnote">For a really good discussion of that problem, see Davis, P.M., &#038; Connolly, M.J.L. (2007). Institutional Repositories: Evaluating the Reasons for Non-use of Cornell University&#8217;s Installation of DSpace. <i>D-Lib Magazine</i>, 13(3/4). 
Retrieved March 14, 2007, from <a href="http://www.dlib.org/dlib/march07/davis/03davis.html" title="Article: Institutional Repositories: Evaluating the Reasons for Non-use of Cornell University&#039;s Installation of DSpace">http://www.dlib.org/dlib/march07/davis/03davis.html</a>.</li></ol>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/two-personal-repository-services/feed/</wfw:commentRss> <slash:comments>10</slash:comments> </item> <item><title>Disseminators As the Core of an Object Repository</title><link>http://dltj.org/article/disseminator-centric-repository/</link> <comments>http://dltj.org/article/disseminator-centric-repository/#comments</comments> <pubDate>Thu, 26 Apr 2007 13:58:32 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[asset actions]]></category> <category><![CDATA[digital libraries]]></category> <category><![CDATA[java]]></category> <category><![CDATA[restlet]]></category> <category><![CDATA[seam]]></category><guid isPermaLink="false">http://dltj.org/2007/04/disseminator-centric-repository/</guid> <description><![CDATA[I&#8217;ve been working to get JBoss Seam tied into Fedora, and along the way thought it would be wise to stop and document a core concept of this integration: the centrality of Fedora Disseminators in the design of the &#8230; <a href="http://dltj.org/article/disseminator-centric-repository/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/04/disseminator-centric-repository/"></abbr><p>I&#8217;ve been working to get JBoss Seam tied into <a href="http://www.fedora.info/" title="Fedora Digital Object Repository homepage">Fedora</a>, and along the way thought it would be wise to stop and document a core concept of this integration:  the centrality of <a 
href="http://www.fedora.info/download/2.2/userdocs/digitalobjects/objectModel.html#DISS" title="Overview:The Fedora Digital Object Model">Fedora Disseminators</a> in the design of the <a href="http://info.drc.ohiolink.edu/" title="Information about the Ohio Digital Resource Commons">Ohio Digital Resource Commons</a>.  Although there is nothing specific to <a href="http://www.jboss.com/products/seam" title="JBoss Seam homepage">JBoss Seam (a Java Enterprise Edition application framework)</a> in these concepts, making an object &#8220;render itself&#8221; does make the Seam-based interface application easier to code and understand.  A disseminator-centric architecture also allows us to put our code investment where it matters the most &mdash; in the repository framework &mdash; and exploit that investment in many places.  So what does it mean to have a disseminator-centric architecture and have objects &#8220;render themselves&#8221;?</p><p><h2>How It Works</h2><br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/sequence.png" title="Sequence Diagram"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/sequence.png" alt="Sequence Diagram " style="float:right;width:50%;margin-left:1.5em;" /></a>This is a sequence diagram showing all of the pieces:</p><ul><li>Browser:  The user&#8217;s browser</li><li>DRCseam:  A JBoss Seam application that generates the user interface and performs much of the business logic.  DRCseam, however, does <strong>not</strong> render the objects or their metadata into browser-consumable artifacts.  
Read on!</li><li>Fedora:  A basic Fedora digital object repository.</li><li>Disseminator:  A simple servlet that performs various transformations on object datastreams to render content usable by the browser.</li></ul><p>With these components in play, here is the description of a sequence to render a page showing the metadata for a repository item:</p><ol><li><span style="font-weight:bolder;font-style:italic">request item page</span>: The browser follows a link to an item detail page.</li><li><span style="font-weight:bolder;font-style:italic">API-A ObjectProfile</span>: The interface application asks the repository for the &#8216;Object Profile&#8217; of the item&#8230;</li><li><span style="font-weight:bolder;font-style:italic">return object profile</span>: &#8230;which the repository returns.  The interface application now knows basic details about the object:  that it exists, the creation and updated timestamps, and so forth.</li><li><span style="font-weight:bolder;font-style:italic">API-A DatastreamDissemination for fullDisplay</span>: The interface application needs the object&#8217;s metadata display, so it asks the object to &#8220;render itself&#8221; by making a call to the Fedora repository for the object&#8217;s &#8220;FullDisplay&#8221; disseminator.</li><li><span style="font-weight:bolder;font-style:italic">call getFullDisplay</span>: The Fedora repository in turn calls the object&#8217;s disseminator with the Persistent Identifier (PID) of the object as a parameter.</li><li><span style="font-weight:bolder;font-style:italic">API-A Datastream for metadata</span>: Using the object PID, the disseminator calls back to the Fedora repository for the descriptive metadata datastream (the DC datastream, in this case)&#8230;</li><li><span style="font-weight:bolder;font-style:italic">XML metadata</span>: &#8230;which the Fedora repository returns.</li><li><span style="font-weight:bolder;font-style:italic">transform metadata</span>: The disseminator performs some 
transformation or derivation on the  descriptive datastream to create an XHTML representation&#8230;</li><li><span style="font-weight:bolder;font-style:italic">XHTML fragment</span>: &#8230;which it returns to the Fedora software&#8230;</li><li><span style="font-weight:bolder;font-style:italic">XHTML fragment</span>: &#8230;which is returned to the interface application&#8230;</li><li><span style="font-weight:bolder;font-style:italic">XHTML page</span>: &#8230;which inserts it at the appropriate place in the XHTML page it has built and returns the XHTML page to the browser.</li></ol><p>Step #4 is where we diverge from previous architectures.  Instead of making the interface application transform the metadata into a human-readable format, the interface application calls the object&#8217;s disseminator to do the job.</p><p><h3>The Heart of It All:  The Disseminator</h3><br />The key to this architecture is <em>asking the object to &#8220;render itself&#8221;</em>.  This puts the task of creating the appropriate representation at the object level.  The object can be an image, a video, a spreadsheet, or a PDF file.  More importantly, the object can be a PDF of a journal article or a PDF of a thesis; in both cases the metadata describing that PDF file will be different (journal/volume/issue in one case and department/degree/advisor in the other).</p><p>Rather than putting special case code in the interface application to render the description of the journal article one way and the thesis another way, that special case code is bound to the object in the form of a &#8220;disseminator&#8221;.  The disseminator methods for the journal article and the thesis share the same name &mdash; <code>getFullDisplay</code> &mdash; but will return entirely different XHTML fragments &mdash; one for a journal article and one for a thesis.  
For both objects, though, the interface application will make a call to the object in the Fedora repository asking for the output of each <code>getFullDisplay</code> dissemination.  In the case of a Dublin Core description, the dissemination output could look like this:</p><div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">&lt;table class=&quot;drc_dublinCore_table&quot;&gt;
&lt;tr class=&quot;drc_dublinCore_row drc_dublinCore_title&quot;&gt;
&lt;td class=&quot;drc_dublinCore_label drc_dublinCore_title&quot;&gt;Title:&lt;/td&gt;
&lt;td class=&quot;drc_dublinCore_value drc_dublinCore_title&quot;&gt;Jester Example&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&quot;drc_dublinCore_row drc_dublinCore_identifier&quot;&gt;
&lt;td class=&quot;drc_dublinCore_label drc_dublinCore_identifier&quot;&gt;Identifier:&lt;/td&gt;
&lt;td class=&quot;drc_dublinCore_value drc_dublinCore_identifier&quot;&gt;demo:exampleObject&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</pre></div></div><p>You&#8217;ll note that there is a liberal application of CSS styles on all of the XHTML elements, allowing for the look of the dissemination to be further transformed in the browser via CSS stylesheets.  A <code>getFullDisplay</code> dissemination for a journal article could look like this:</p><div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">&lt;table class=&quot;drc_ejc_table&quot;&gt;
&lt;tr class=&quot;drc_ejc_row drc_ejc_title&quot;&gt;
&lt;td class=&quot;drc_ejc_label drc_ejc_title&quot;&gt;Article Title:&lt;/td&gt;
&lt;td class=&quot;drc_ejc_value drc_ejc_title&quot;&gt;Taking Advantage of Fedora Disseminations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&quot;drc_ejc_row drc_ejc_volume&quot;&gt;
&lt;td class=&quot;drc_ejc_label drc_ejc_volume&quot;&gt;Volume:&lt;/td&gt;
&lt;td class=&quot;drc_ejc_value drc_ejc_volume&quot;&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&quot;drc_ejc_row drc_ejc_issue&quot;&gt;
&lt;td class=&quot;drc_ejc_label drc_ejc_issue&quot;&gt;Issue:&lt;/td&gt;
&lt;td class=&quot;drc_ejc_value drc_ejc_issue&quot;&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;</pre></div></div><p><h3>Looking at the Pieces</h3><br />There is a demonstration system set up for a short period of time that shows all of the pieces.  First, the disseminator:</p><ul><li><span class="removed_link" title="http://drc-dev.ohiolink.edu:8080/BaseDisseminator/getFullDisplay/demo:exampleObject">http://drc-dev.ohiolink.edu:8080/BaseDisseminator/getFullDisplay/demo:exampleObject</span></li></ul><p>Next, how this disseminator looks as accessed through the Fedora repository:</p><ul><li><span class="removed_link" title="http://drc-dev.ohiolink.edu:8080/fedora/get/demo:exampleObject/demo:bDefExample/getFullDisplay/">http://drc-dev.ohiolink.edu:8080/fedora/get/demo:exampleObject/demo:bDefExample/getFullDisplay/</span></li></ul><p>And finally, how this result looks through the Seam-based interface application.  (A note about this application &mdash; only this URL works at the moment even though there are other links on the page.  This is also the &#8216;trunk&#8217; version of our interface code, so it is likely to change and/or break and/or work better at any time.)</p><ul><li><span class="removed_link" title="http://drc-dev.ohiolink.edu:8080/drc/item.seam?itemId=demo%3AexampleObject">http://drc-dev.ohiolink.edu:8080/drc/item.seam?itemId=demo%3AexampleObject</span></li></ul><p><h2>Fedora Setup</h2><br />In addition to the Seam-based interface application and the disseminator code, there is setup required at the Fedora repository &mdash; specifically, the creation of a Behavior Definition (bDef) that describes the disseminators that the objects share in common and the creation of a Behavior Mechanism (bMech) that describes the implementation of that definition for a particular object type.  
Below is a series of screen shots that show the steps to create the bDef and bMech.</p><p><h3>Disseminator Behavior Definition (bDef)</h3><br />Using the Fedora Admin client, under the &#8220;Builders&#8221; menu, select &#8220;Behavior Definition Builder&#8221;.  The first pane, &#8220;General&#8221; parameters, use a specific PID of &#8216;<code>demo:bDefExample</code>&#8216; and put something in for the Behavior Object Name, Behavior Object Description, and one of the Dublin Core Metadata fields.  (It doesn&#8217;t matter what you put in for these values.)<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bdbgeneral.png" title="Fedora Admin Behavior Definition Builder “General” pane"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bdbgeneral.png" alt="Fedora Admin Behavior Definition Builder “General” pane" style="width:85%;" /></a></p><p>Under the &#8220;Abstract Methods&#8221; pane, create new definitions for each of the disseminator methods.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bdbabstractmethods.png" title="Fedora Admin Behavior Definition Builder “Abstract Methods” pane"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bdbabstractmethods.png" alt="Fedora Admin Behavior Definition Builder “Abstract Methods” pane" style="width:85%;" /></a></p><p>Under the &#8220;Documentation&#8221; pane, put something in the first entry.  Again, it doesn&#8217;t matter what is put in for these values, but they are required.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bdbdocumentation.png" title="Fedora Admin Behavior Definition Builder “Documentation” pane"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bdbdocumentation.png" alt="Fedora Admin Behavior Definition Builder “Documentation” pane" style="width:85%;" /></a></p><p>Select &#8220;Ingest&#8221; at the bottom of the window, and the <code>demo:bDefExample</code> bDef will be created.  
Alternatively, you could import the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/BaseDisseminator/trunk/resources/foxml/demo_bDefExample.xml?rev=774"><code>demo:bDefExample</code> saved in the DRC source code repository</span> (choose &#8220;original format&#8221; at the bottom of that page).</p><p><h3>Disseminator Mechanism Definition (bMech)</h3><br />The bMech is a little more complicated.  Under the &#8220;Builders&#8221; menu, select &#8220;Behavior Mechanism Builder&#8221;.  The first pane, &#8220;General&#8221; parameters, use a specific PID of &#8216;<code>demo:bMechExample</code>&#8216; and put something in for the Behavior Object Name, Behavior Object Description, and one of the Dublin Core Metadata fields.  (It doesn&#8217;t matter what you put in for these values.)  In the &#8220;Behavior Definition Contract&#8221; pick the bDef just created (<code>demo:bDefExample</code>).<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbgeneral.png" title="Fedora Admin Behavior Mechanism Builder “General” pane"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbgeneral.png" alt="Fedora Admin Behavior Mechanism Builder “General” pane" style="width:85%;" /></a></p><p>In the &#8220;Service Profile&#8221; pane, put in values in the &#8220;General&#8221; area (it doesn&#8217;t matter what).  
In the Service Binding area, make sure the Message Protocol is <code>HTTP GET</code>, put in <code>text/html, text/xml</code> for Input MIME Types and put in <code>text/html, text/xml, text/plain</code> for Output MIME Types.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbserviceprofile.png" title="Fedora Admin Behavior Mechanism Builder “Service Profile” pane"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbserviceprofile.png" alt="Fedora Admin Behavior Mechanism Builder “Service Profile” pane" style="width:85%;" /></a></p><p>Under the Service Methods pane, put in <code>http://localhost:8080/BaseDisseminator</code> for the Base URL.  (The disseminator is deployed in the same servlet container as the Fedora repository and the Seam interface application, at the &#8220;/BaseDisseminator&#8221; context path.)  Create Service Method Definitions that correspond to the Abstract Methods in the bDef.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbservicemethods.png" title="Fedora Admin Behavior Mechanism Builder “Service Methods” pane"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbservicemethods.png" alt="Fedora Admin Behavior Mechanism Builder “Service Methods” pane" style="width:85%;" /></a></p><p>Select &#8220;Properties&#8221; for each one of the Service Method Definitions in turn.  &#8220;echo&#8221; is a special disseminator method that simply echoes back the context parameters of the disseminator request.  
This is useful for seeing exactly what the Fedora server is going to give to the disseminator.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbservicemethods-echo.png" title="Fedora Admin Behavior Mechanism Builder “Service Methods” Definitions for “echo” Method"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbservicemethods-echo.png" alt="Fedora Admin Behavior Mechanism Builder “Service Methods” Definitions for “echo” Method" style="width:85%;" /></a></p><p>With the exception of &#8220;echo&#8221; all of the other Service Method Definitions are the same.  The Method Binding consists of the disseminator method followed by a slash and the PID placeholder followed by a question mark and &#8216;dc&#8217; equals the DC placeholder.  Since the Method Binding field has two placeholders, there are two entries in the Method Parameter Definitions area.  The first is for PID &mdash; a &#8220;Default&#8221; parameter that is required and passed by value to the disseminator.  The default value is the special value <code>$PID</code>, which the repository software will replace with the PID of the object as the disseminator is called.  The second is for DC, a &#8220;Datastream&#8221; parameter that is required and passed to the disseminator by URL reference.  
The disseminator doesn&#8217;t actually use this reference to a datastream, but it is a requirement that all bMechs pass a datastream of one sort or another to the disseminator.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbservicemethods-getfulldisplay.png" title="Fedora Admin Behavior Mechanism Builder “Service Methods” Definitions for “getFullDisplay” Method"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbservicemethods-getfulldisplay.png" alt="Fedora Admin Behavior Mechanism Builder “Service Methods” Definitions for “getFullDisplay” Method" style="width:85%;" /></a></p><p>If you have followed all of the steps so far, under the &#8220;Datastream Input&#8221; pane there will be one entry for DC in the table.  The only thing that needs to be done here is adding &#8220;text/xml&#8221; in the MIMEType column.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbdatastreaminput.png" title="Fedora Admin Behavior Mechanism Builder “Datastream Input” pane"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbdatastreaminput.png" alt="Fedora Admin Behavior Mechanism Builder “Datastream Input” pane" style="width:90%;" /></a></p><p>Under the &#8220;Documentation&#8221; pane, put something in the first entry.  Again, it doesn&#8217;t matter what is put in for these values, but they are required.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbdocumentation.png" title="Fedora Admin Behavior Mechanism Builder “Documentation” pane"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/bmbdocumentation.png" alt="Fedora Admin Behavior Mechanism Builder “Documentation” pane" style="width:85%;" /></a></p><p>Select &#8220;Ingest&#8221; at the bottom of the window, and the <code>demo:bMechExample</code> bMech will be created.  
Alternatively, you could import the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/BaseDisseminator/trunk/resources/foxml/demo_bMechExample.xml?rev=774"><code>demo:bMechExample</code> saved in the DRC source code repository</span> (choose &#8220;original format&#8221; at the bottom of that page).</p><p><h3>Sample Object</h3><br />The last step is to add this disseminator bDef/bMech combination to an object.  Edit any object in the repository and go to the &#8220;Disseminators&#8221; pane.  If there are other disseminators already defined for this object, select &#8220;New&#8221; along the left side.  Put in a label &mdash; any label will do.  Next to &#8220;Behavior defined by&#8230;&#8221; select <code>demo:bDefExample</code>.  Then next to &#8220;Mechanism&#8221; select <code>demo:bMechExample</code>.  The admin client will prompt for a DC binding; select &#8220;Add&#8221; and choose the DC datastream in the pop-up window.<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/objectdisseminatorsdatastream.png" title="Fedora Admin Sample Object’s “Disseminators” pane in progress"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/objectdisseminatorsdatastream.png" alt="Fedora Admin Sample Object’s “Disseminators” pane in progress" style="width:85%;" /></a></p><p>Select &#8220;Save Changes&#8221; at the bottom.  
The completed disseminator looks like this:<br /><a href="http://cdn.dltj.org/wp-content/uploads/2007/04/objectdisseminators.png" title="Fedora Admin Sample Object’s “Disseminators” pane completed"><img src="http://cdn.dltj.org/wp-content/uploads/2007/04/objectdisseminators.png" alt="Fedora Admin Sample Object’s “Disseminators” pane completed" style="width:85%;" /></a></p><p>There is <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/BaseDisseminator/trunk/resources/foxml/demo_exampleObject.xml?rev=774">a sample object in the DRC source code repository</span> that has the disseminator already defined.</p><p><h2>Notes</h2><br />Comments about this architecture are certainly welcome.  I&#8217;m sure I&#8217;ll be writing about it more in the future, but here are some thoughts at this point:</p><p><h3>Future Directions</h3><br />In this case, I&#8217;m using an XSLT stylesheet to transform the Dublin Core XML into an XHTML table.  That stylesheet is stored in the BaseDisseminator WAR file.  The stylesheet could just as easily be a datastream of a special &#8220;formatting&#8221; object in the repository.  One of the key distinctions of OhioLINK&#8217;s Fedora implementation is that institutions using the repository will be able to &#8220;brand&#8221; their content in any way they choose.  Having the flexibility of storing metadata transformations just like any other object in the repository would seem to be of great advantage in that scenario.</p><p>On a related front, this style of implementation would be greatly enhanced by the work of the Fedora <a href="http://www.cs.cornell.edu/payette/fedora/designs/cmda/" title="CMDA Proposal">Content Model Dissemination Architecture</a> (CMDA).  Because disseminators must be bound to specific objects rather than classes of objects, management of the variety of bMechs in a scenario such as this will likely become difficult very soon.  
I&#8217;m heartened by the fact that the CMDA work is going on and will cut our management complexity dramatically when it becomes available.</p><p><h3>Acknowledgements</h3><br />These concepts are based in part on the work of the Digital Library Federation&#8217;s <a href="https://wiki.dlib.indiana.edu/display/DLFAquifer/Asset+Action+Project" title="Aquifer Digital Collections Asset Actions homepage">Aquifer Asset Actions</a> technical working group and discussions among members of the OAI <a href="http://www.openarchives.org/ore/" title="Open Archives Initiative Protocol - Object Exchange and Reuse homepage">Object Reuse and Exchange</a> technical committee as well as conversations with many Fedora developers and implementors.  Thanks, everyone.</p><p>[Update 20070426T1147 : Fixed the sample object URL.  Thanks, Jodi.]<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/BaseDisseminator/trunk/resources/foxml/demo_exampleObject.xml?rev=774 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/BaseDisseminator/trunk/resources/foxml/demo_bMechExample.xml?rev=774 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/BaseDisseminator/trunk/resources/foxml/demo_bDefExample.xml?rev=774 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu:8080/drc/item.seam?itemId=demo%3AexampleObject on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu:8080/fedora/get/demo:exampleObject/demo:bDefExample/getFullDisplay/ on January 13th, 
2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu:8080/BaseDisseminator/getFullDisplay/demo:exampleObject on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://rama.grainger.uiuc.edu/assetActions/ to https://wiki.dlib.indiana.edu/display/DLFAquifer/Asset+Action+Project on January 19th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/disseminator-centric-repository/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Building an Institutional Repository Interface Using EJB3 and JBoss Seam</title><link>http://dltj.org/article/drc-ir-ejb3-seam/</link> <comments>http://dltj.org/article/drc-ir-ejb3-seam/#comments</comments> <pubDate>Fri, 19 Jan 2007 04:00:49 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[ejb3]]></category> <category><![CDATA[icor2007]]></category> <category><![CDATA[java]]></category> <category><![CDATA[programming]]></category> <category><![CDATA[seam]]></category><guid isPermaLink="false">http://dltj.org/2007/01/drc-ir-ejb3-seam/</guid> <description><![CDATA[This tour is designed to show the overall architecture of a FEDORA digital object repository application within the JBoss Seam framework while at the same time pointing out individual design decisions and extension points that are specific to the Ohio &#8230; <a href="http://dltj.org/article/drc-ir-ejb3-seam/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/01/drc-ir-ejb3-seam/"></abbr><p>This tour is designed to show the overall architecture of a <a href="http://www.fedora.info/" title="Fedora">FEDORA digital 
object repository</a> application within the <a href="http://www.jboss.com/products/seam" title="JBoss.com - JBoss Seam">JBoss Seam framework</a> while at the same time pointing out individual design decisions and extension points that are specific to the <a href="http://drc-dev.ohiolink.edu/" title="Digital Resource Commons - Trac">Ohio Digital Resource Commons application</a>. The tour is geared towards software developers and assumes some familiarity with <a href="http://java.sun.com/products/servlet/" title="Java Servlet Technology">Java Servlet programming</a>, although that is not strictly required.  Knowledge of JBoss Seam, <a href="http://www.hibernate.org/" title="hibernate.org - Hibernate">Hibernate</a>/<a href="http://java.sun.com/javaee/overview/faq/persistence.jsp" title="Java Persistence API FAQ">Java Persistence API</a>, <a href="http://jcp.org/en/jsr/detail?id=220" title="The Java Community Process(SM) Program - JSRs: Java Specification Requests - detail JSR# 220">EJB3</a> and <a href="http://java.sun.com/javaee/" title="Java EE at a Glance">Java EE</a> would be helpful but not required; brief explanations of core concepts of these technologies are included in this tour.</p><p>The tour is based on <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk?rev=709">revision 709 of /drc/trunk</span> and was last updated on 18-Jan-2007.</p><p>This tour will also be incorporated into a <span class="removed_link" title="http://openrepositories.org/program/fedora#session4">presentation at Open Repositories 2007 on Tuesday afternoon</span>.</p><p><h2 id="DirectoryLayout">Directory Layout</h2></p><p>The source directory tree has four major components:  &#8216;lib&#8217;, &#8216;resources&#8217;, &#8216;src&#8217;, and &#8216;view&#8217;.</p><p><strong>lib &#8211; libraries required by the application.</strong> The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/lib?rev=709">lib directory</span> contains all of the JAR libraries required 
by the application. Its contents are a mix of the Seam-generated skeleton (pretty much everything at the top level of the &#8216;lib&#8217; directory) and JAR libraries that are specific to the DRC application (in subdirectories of &#8216;lib&#8217; named for the library in use).  For instance, the &#8216;commons-codec-1.3&#8217;, &#8216;hibernate-all&#8217;, and &#8216;jboss-seam&#8217; JAR files were all brought into the project via &#8216;seam-gen&#8217;, while the &#8216;lib/commons-net-1.4.1/commons-net-1.4.1.jar&#8217; library was added specifically for this project. A convention has been established whereby new libraries added to the project appear as entries in the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/lib/lib.properties?rev=709">lib.properties file</span>, which is used by a <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/build.xml?rev=709#L65">series of directives in the build.xml file</span> to set up the classpaths <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/build.xml?rev=709#L101">for compiling</span> and <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/build.xml?rev=709#L116">for building the EJB JAR</span>. This makes the introduction of new libraries into the application more explicit and easier to test. Note that the newly included library directory also includes a copy of any license file associated with that library; this is not only a requirement to use some libraries but is also a good practice to show the lineage of some of the lesser known libraries. 
(For an example of what is required, see the changes <span class="removed_link" title="http://drc-dev.ohiolink.edu/changeset/699%23file2">to build.xml</span> and <span class="removed_link" title="http://drc-dev.ohiolink.edu/changeset/699%23file3">to lib.properties</span> in order to bring the Apache Commons Net library into the application.)</p><p><strong>resources &#8211; configuration files and miscellaneous stuff.</strong> The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/resources?rev=709">resources directory</span> holds the various configuration files required by the application plus other files used for testing and demonstration.  Much of this was generated by the Seam-generated skeleton as well.  Some key files here are the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/import.sql?rev=709">import.sql file</span> (SQL statements that are used to preload the RDBMS used by Hibernate as the mocked up repository system) and the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/test-datastreams?rev=709">test-datastreams directory</span> which has sample files for each of the media types.</p><p><strong>src &#8211; Java source code.</strong> The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src?rev=709">src directory</span> contains all of the Java source code for the application.  
Everything exists in a package called &#8216;edu.ohiolink.drc&#8217; with subpackages for <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/action?rev=709">classes handling actions</span> from the view component of the MVC, <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity?rev=709">entity beans</span> (sometimes known as Data Access Objects &#8212; or DAOs &#8212; I think), <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/exceptions?rev=709">exception classes</span> (more on this below), <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/fedora?rev=709">classes for working with FEDORA</span> (not currently used), <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler?rev=709">media type handler classes</span> (more on this below), <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/test?rev=709">unit test classes</span> (not currently used), and <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/utility?rev=709">utility classes</span>.</p><p><strong>view &#8211; XHTML templates, CSS files, and other web interface needs.</strong> The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/view?rev=709">view directory</span> holds all of the files for the &#8220;view&#8221; aspect of the Model-View-Controller paradigm. 
More information about the view components is below.</p><p><h2 id="EntityClasses">Entity Classes</h2></p><p>The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity?rev=709">entity beans package</span> has three primary entity beans defined: <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity/Item.java?rev=709">Item.java</span>, <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity/Datastream.java?rev=709">Datastream.java</span>, and <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity/Description.java?rev=709">Description.java</span>. (The FedoraServer.java entity bean is not used at this time.)  Item.java is the primary bean that represents an object in the repository.  Datastream.java and Description.java are component beans that only exist in the lifecycle of an Item.java bean; Datastream.java holds a representation of a FEDORA object datastream and Description.java holds a representation of a Dublin Core datastream for that object.</p><p>The Datastream and Description objects are annotated with <tt>@Embedded</tt> in the Item.java source; this is Hibernate&#8217;s way of saying that these objects do not stand on their own.  Item.java also has numerous methods marked with a <tt>@javax.persistence.Transient</tt> annotation meaning that this is information not stored in the backing Hibernate database; these methods are for the various content handlers, which will be outlined below.</p><p><h2 id="MockRepository">Mock Repository</h2></p><p>As currently configured, the entity beans pull their information from a static RDBMS using Hibernate rather than from an underlying FEDORA digital object repository. 
(You&#8217;ll need to go back to <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk?rev=691">revision 691</span> to see how far we got with the FEDORA integration into JBoss Seam before we switched our development focus to the presentation &#8216;view&#8217; aspects of the application.) As <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/drc-ds.xml?rev=709">currently configured</span>, Hibernate uses an embedded Hypersonic SQL database for its datastore.  As part of the application deploy process, the Java EE container will instantiate a Hypersonic database and preload it with the contents of the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/import.sql?rev=709">import.sql file</span>. (The import.sql file contains just three sample records at the moment:  one each for a text file, a PDF file, and a graphic file.)</p><p>All of the data for a repository object is contained in a single table record. Hibernate manages the process for us of reading that record out of the database and creating the three corresponding Java objects:  Item, Datastream and Description. (Hibernate could also handle the process of updating the underlying table record if we were to change a value in one of the Java objects.) The mapping of table column to Java object field is handled by the <tt>@Column(name="xx")</tt> annotations in the entity beans.</p><p>For Datastream, what is stored in the database is not the datastream content itself but rather a filename that points to the location of the datastream file.  The file path in this field can either be absolute (meaning a complete path starting from the root directory of the filesystem) or a relative path.  In the case of the latter, the path is relative to the deployed application&#8217;s WAR directory (something like &#8220;&#8230;/jboss-4.0.5.GA/server/default/deploy/drc.ear/drc.war/&#8221; for instance). 
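</p><p>As a minimal sketch of the path rule just described, and not the actual DRC code, the Java below resolves a stored content location against the WAR directory unless the stored path is already absolute. The class name <tt>ContentLocationSketch</tt>, the <tt>resolve</tt> method, and the directory and file names are all hypothetical, chosen only for illustration.</p>

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ContentLocationSketch {

    // Hypothetical helper: an absolute stored path is used as-is, while a
    // relative one is resolved against the deployed WAR directory.
    public static Path resolve(Path warDirectory, String contentLocation) {
        Path stored = Paths.get(contentLocation);
        if (stored.isAbsolute()) {
            return stored.normalize();
        }
        return warDirectory.resolve(stored).normalize();
    }

    public static void main(String[] args) {
        // Both the WAR directory and the file names are made-up examples.
        Path war = Paths.get("/srv/example/drc.ear/drc.war");
        System.out.println(resolve(war, "test-datastreams/sample.txt"));
        System.out.println(resolve(war, "/var/data/sample.pdf"));
    }
}
```

<p>Keeping this resolution in one place fits the design described next, where callers only ever ask for content and never see a location.</p><p>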
Note that the getter/setter methods for the contentLocation are <tt>private</tt> because the rest of the application does not need to know the location of the datastreams; this will also be true when the DRC application is connected to a FEDORA digital object repository. The method marked <tt>public</tt> instead is getContent, and the implementation of getContent hides the complexity of the fact that the datastream is coming from a disk file rather than a FEDORA repository call. For the three records/repository-objects currently defined in &#8216;import.sql&#8217; there are three corresponding demo datastreams in the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/test-datastreams?rev=709">test-datastreams directory</span>.</p><p>In all likelihood, this representation of the FEDORA repository will be too simple for us to move forward much further. In particular, the current notion of one datastream per repository object is too simplistic. The Datastream embedded object will likely need to be broken out into a separate table with a corresponding distinct Java object. (We may reach the same point soon for the Description object as well.)</p><p>By using the Entity Beans as a buffer between the business logic and the view components of the rest of the application, I hope we can minimize/localize the changes required in the future in order to replace the mock repository with a real underlying FEDORA repository.</p><p><h2 id="ViewTemplates">View Templates</h2></p><p>The preferred view technology for JBoss Seam is <a class="ext-link" href="http://facelets.dev.java.net/" title="301 Moved Permanently"><span class="icon">Facelets</span></a>, a view technology for Java Server Faces that does not require the use of Java Server Pages (JSP). Although the &#8216;.xhtml&#8217; pages in the view directory bear a passing resemblance to JSP, behind the scenes they are radically different.  
Of note for us is the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/view/layout?rev=709">clean templating system</span> used to generate pages. The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/view/home.xhtml?rev=709">home.xhtml file</span> has a reference to the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/view/layout/template.xhtml?rev=709">template.xhtml file in the &#8216;layout&#8217; directory</span>. If you read through the template.xhtml file, you can see where the Facelets engine will pull in other .xhtml files in addition to the content within the <tt>&lt;ui:define name="body"&gt;</tt> tag of home.xhtml.</p><p><h2 id="ContentHandlers">Content Handlers</h2></p><p>The paradigm of handling different media types within the DRC application is guided in large part by the notion of <a class="ext-link" href="http://dltj.org/2006/05/fedora-disseminators/"><span class="icon">disseminators for FEDORA objects</span></a> and the <a class="ext-link" href="https://wiki.dlib.indiana.edu/display/DLFAquifer/Asset+Action+Project" title="Aquifer Digital Collections"><span class="icon">Digital Library Federation Aquifer Asset Actions experiments</span></a>.  The underlying concept is to push the media-specific content handling into the digital object repository and to have the presentation interface consume those content handlers as it is preparing the end-user presentation.</p><p>For instance, the DRC will need to handle content models for PDFs, images, video, and so forth.  Furthermore, how a video datastream from the Digital Video Collection is offered to the user may be different than how a video datastream from a thesis is offered to the user. 
Rather than embedding the complexity of making those interface decisions into the front-end DRC application, this model of content handlers pushes that complexity closer to the objects themselves by encoding those behaviors as disseminators of the object.  What the presentation layer gets from the object is a chunk of XHTML that it inserts into the dynamically generated HTML page at the right place.</p><p>There is work beginning on a framework for FEDORA disseminators at <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/BaseDisseminator/trunk?rev=691">/BaseDisseminator/trunk</span> in the source code repository; that work has been put on hold at the moment in favor of focusing on the presentation interface.  In order to prepare for the time when the presentation behaviors are encoded as FEDORA object disseminators, the current presentation layer makes use of Content Handlers for each of the media types.  The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/Handler.java?rev=709">Handler interface</span> defines the methods required by each handler and the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709">TextHandler class</span>, the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/ImageHandler.java?rev=709">ImageHandler class</span>, and the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/PdfHandler.java?rev=709">PdfHandler class</span> implement the methods for the three media types already defined.</p><p>Of these, the <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709">TextHandler class</span> is the most complete, so I&#8217;ll use it as an example.</p><ul><li>The <span class="removed_link" 
title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709#L63">getRawDatastream method</span> takes the datastream and sends it back to the browser with the HTTP headers that cause a File-Save dialog box to open.</li><li>The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709#L92">getFullDisplay method</span> returns a chunk of XHTML that presents the full metadata in a manner that can be included in a full metadata display screen.</li><li>The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709#L132">getRecordDisplay method</span> (currently unwritten) returns a chunk of XHTML used to represent the object in a list of records that resulted from a user&#8217;s search or browse request.</li><li>The <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709#L140">getThumbnail method</span> (currently unwritten) returns a static graphic thumbnail rendition of the datastream (e.g. a cover page, a key video frame, etc.).</li></ul><p>By making these content handlers distinct classes, it is anticipated that the rendering code for each of these methods can be more easily moved to FEDORA object disseminators with minimal impact to the surrounding DRC interface application.</p><p><h2 id="ExceptionHandling">Exception Handling</h2></p><p>The DRC application follows the practice suggested by Barry Ruzek in <a href="http://www.oracle.com/technetwork/articles/entarch/effective-exceptions-092345.html" title="Effective Java Exceptions">Effective Java Exceptions</a> (found via <a href="http://www.theserverside.com/news/thread.tss?thread_id=43820" title="http://www.theserverside.com/news/thread.tss?thread_id=43820">this link on The Server Side</a>).  
The article can be summarized as:</p><blockquote><p>One type of exception is a <strong>contingency</strong>, which means that a process was executed that cannot succeed because of a known problem (the example he uses is that of a checking account, where the account has insufficient funds, or a check has a stop payment issued.) These problems should be handled by way of a distinct mechanism, and the code should expect to manage them.</p><p>The other type of exception is a <strong>fault</strong>, such as the IOException. A fault is typically not something that is or should be expected, and therefore handling faults should probably not be part of a normal process.</p><p>With these two classes of exception in mind, it&#8217;s easy to see what should be checked and should be unchecked: the <strong>contingencies should be checked</strong> (and descend from Exception) and the faults should be unchecked (and descend from Error).</p></blockquote><p>All unchecked exceptions generated by the application are subclasses of <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/exceptions/DrcBaseAppException.java?rev=709">DrcBaseAppException</span>. (<tt>DrcBaseAppException</tt> itself is a subclass of <tt>RuntimeException</tt>.)  For an example, see <span class="removed_link" title="http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/exceptions/NoHandlerException.java?rev=709">NoHandlerException</span>.  By setting up all of the application&#8217;s exceptions to derive from this point, we have one place where logging of troubleshooting information can take place (although this part of the application has not been set up yet). Except when there is good reason to do otherwise, this pattern should be maintained.</p><p>At this point, no checked (or <em>contingency</em>) exceptions specific to the DRC have been defined.  
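</p><p>The unchecked hierarchy described above can be sketched roughly as follows. This is an illustration under assumptions rather than the DRC source: only the class names <tt>DrcBaseAppException</tt> and <tt>NoHandlerException</tt> and the <tt>RuntimeException</tt> parent come from the text, while the constructors, the message wording, the wrapper class <tt>ExceptionSketch</tt>, and the <tt>handlerNameFor</tt> lookup are hypothetical.</p>

```java
public class ExceptionSketch {

    // Base class for unchecked (fault) exceptions. The name and parent type
    // follow the article; the constructor is an assumption.
    public static class DrcBaseAppException extends RuntimeException {
        public DrcBaseAppException(String message) {
            super(message);
        }
    }

    // A fault raised when no content handler matches a media type.
    public static class NoHandlerException extends DrcBaseAppException {
        public NoHandlerException(String mediaType) {
            super("No handler for media type: " + mediaType);
        }
    }

    // Hypothetical lookup that throws the fault for unknown media types.
    public static String handlerNameFor(String mediaType) {
        if ("text/plain".equals(mediaType)) {
            return "TextHandler";
        }
        throw new NoHandlerException(mediaType);
    }

    public static void main(String[] args) {
        System.out.println(handlerNameFor("text/plain"));
        try {
            handlerNameFor("video/mpeg");
        } catch (DrcBaseAppException fault) {
            // Catching the base type covers every application fault at once.
            System.out.println("fault: " + fault.getMessage());
        }
    }
}
```

<p>Catching the base type is what would give a single place to add logging later. Checked (contingency) exceptions are left out of this sketch because none are defined yet.</p><p>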
When they are needed, though, they will follow the same basic structure with a base exception derived from <tt>Exception</tt>.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/lib?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/build.xml?rev=709#L65 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/resources?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/changeset/699%23file3 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/changeset/699%23file2 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/build.xml?rev=709#L116 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/build.xml?rev=709#L101 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/lib/lib.properties?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" 
class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/view/layout?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/drc-ds.xml?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk?rev=691 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity/Description.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity/Datastream.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity/Item.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/view?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/utility?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/test?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to 
remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/exceptions/DrcBaseAppException.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709#L140 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709#L132 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709#L92 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709#L63 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/PdfHandler.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/ImageHandler.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/TextHandler.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to 
http://drc-dev.ohiolink.edu/browser/BaseDisseminator/trunk?rev=691 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/exceptions/NoHandlerException.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/import.sql?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/import.sql?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/view/layout/template.xhtml?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/fedora?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/exceptions?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/entity?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to 
http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/action?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/resources/test-datastreams?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/src/edu/ohiolink/drc/handler/Handler.java?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/browser/drc/trunk/view/home.xhtml?rev=709 on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://openrepositories.org/program/fedora#session4 on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://rama.grainger.uiuc.edu/assetActions/ to https://wiki.dlib.indiana.edu/display/DLFAquifer/Asset+Action+Project on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://dev2dev.bea.com/pub/a/2006/11/effective-exceptions.html to http://www.oracle.com/technetwork/articles/entarch/effective-exceptions-092345.html on January 20th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/drc-ir-ejb3-seam/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Looking Forward to Version 2.2 of FEDORA</title><link>http://dltj.org/article/fedora-2-point-2/</link> <comments>http://dltj.org/article/fedora-2-point-2/#comments</comments> 
<pubDate>Mon, 01 Jan 2007 01:46:39 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[icor2007]]></category> <category><![CDATA[java]]></category> <category><![CDATA[jboss]]></category> <category><![CDATA[library service-oriented architecture]]></category> <category><![CDATA[open source]]></category><guid isPermaLink="false">http://dltj.org/2006/12/fedora-2-point-2/</guid> <description><![CDATA[Sandy Payette, Co-Director of the Fedora Project and Researcher in the Cornell Information Science department, announced a tentative date for release 2.2 of the FEDORA digital object repository. The Fedora development team would like to announce that Fedora 2.2 will &#8230; <a href="http://dltj.org/article/fedora-2-point-2/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2006/12/fedora-2-point-2/"></abbr><p><a href="http://www.cs.cornell.edu/payette" title="Sandy Payette&#039;s homepage">Sandy Payette</a>, Co-Director of the Fedora Project and Researcher in the Cornell Information Science department, <a href="http://article.gmane.org/gmane.comp.cms.fedora-commons.user/2330/" title="Posting to the Fedora-Users mailing list by Sandy Payette with the subject &#039;Release Date for Fedora 2.2&#039; dated Fri, Dec 22, 2006 at 15:25:56 EST">announced a tentative date for release 2.2 of the FEDORA</a> digital object repository.</p><blockquote><p>The Fedora development team would like to announce that Fedora 2.2 will be released on Friday, January 19, 2007.</p><p>This new release will contain many significant new features and enhancements, including <i>[numbers added to the original for the sake of subsequent commentary]</i>:</p><ol><li>Fedora repository is now a web application (.war) that can be installed in any container</li><li>Fedora authentication has been 
refactored to use servlet filters (no longer Tomcat realms)</li><li>A new Fedora installer makes it easy to get started with Fedora (with both &#8220;quick&#8221; and &#8220;custom&#8221; install options)</li><li>GSearch service (backed by Lucene or Zebra) &#8211; flexible, configurable, indexes any datastream</li><li>Journaling service to create a backup/mirror repository</li><li>New checksum features for datastreams</li><li>Support for Postgres database configuration</li><li>Standard system logging with Log4J</li><li>Over 40 bug fixes</li><li>Many other enhancements</li></ol><p>Be on the lookout for the release announcement the new year!   Also, there will be opportunities to talk with the Fedora development team at Open Repositories 2007 (<a href="http://openrepositories.org/" title="Open Repositories 2007 homepage">http://openrepositories.org/</a>).</p></blockquote><p>This is great news and a major step forward for the project.  Here are some reasons why I think this is true.</p><p><h2>1. Fedora repository is now a web application (.war)</h2><br />To this point, the FEDORA repository application distribution has been pre-bundled inside a Tomcat Java servlet container.  The binding has been pretty tight with certain dependencies written into the Tomcat configuration itself.  That made it very difficult to install FEDORA into an organization&#8217;s existing servlet container (be it another installation of Tomcat or Jetty/JBoss/Glassfish, etc.).  Even more problematic, there were reports of problems trying to get JSP-based applications to work inside the FEDORA-supplied container (we ran into this ourselves) meaning that organizations wanting to run both FEDORA and another servlet-based application needed to run <i>two</i> servlet containers; pretty inefficient.  
(OhioLINK was in this position in its early implementations of the <a href="http://info.drc.ohiolink.edu/" title="Ohio Digital Resource Commons home page">Ohio DRC</a> project.)</p><p>With release 2.2, the core developers have effectively turned the software distribution inside out.  The primary output of the new build process is a standard <b>W</b>eb <b>AR</b>chive (or <b>WAR</b>) file that can be put inside any servlet container.  The new installation program (see #3 below) comes with a Tomcat distribution, should a new installation need it, but it is no longer required.  There have been reports that the new WAR-based distribution works inside the Jetty servlet container; we&#8217;re hoping it will work in the JBoss Application Server as well (<a href="http://dltj.org/2006/10/java-framework/">since that is what we&#8217;re using to build our next generation interface</a>).</p><p><h2>2. Fedora authentication has been refactored to use servlet filters</h2><br />I&#8217;m not quite sure what this means, but I have hopes that it will make integration with <a href="http://shibboleth.internet2.edu/" title="Project Shibboleth home page">Shibboleth</a> easier.  Can anyone else see the path between FEDORA and Shibboleth and comment on it?</p><p><h2>3. A new Fedora installer makes it easy to get started with Fedora</h2><br />From the start, FEDORA required a Java servlet container in order to run.  To make the installation job easier for those who are not familiar with Java servlet containers, the FEDORA installation process did everything for you.  Now that the relationship between the FEDORA application and the servlet container has been flipped around (see #1 above), the core developers devised an easy-to-use installation application that mimics the simplicity of the previous installation style while allowing others to make use of FEDORA as an integrated application within an existing servlet container.</p><p><h2>4. 
GSearch service</h2><br />The original FEDORA search service, the appropriately-named &#8220;basic search,&#8221; indexes only the Dublin Core (DC) datastream of each object.  As has been mentioned on the Fedora-Users mailing list several times, the DC datastream is really meant as an administrative metadata datastream and not necessarily the full description of the object; <a href="http://dltj.org/2006/09/description-datastream/">that full description can be stored in other datastreams of a FEDORA object</a>.  Not only did basic search not index these other descriptive metadata streams, but it also wouldn&#8217;t index the full text of PDF, text, and other indexable datastreams.</p><p><span class="removed_link" title="http://defxws2006.cvt.dk/fedoragsearch/">GSearch</span> &mdash; where &#8220;G&#8221; stands for &#8220;General&#8221; but could equally well stand for &#8220;Gert&#8221; Schmeltz Pedersen, its lead developer from the Technical University of Denmark &mdash; does all of the above as a new component in the FEDORA Service Framework.  We extend our gratitude to Gert and his colleagues for contributing their work to the general FEDORA distribution as well as to <a href="http://www.deff.dk/" title="Danmarks Elektroniske Fag- og Forskningsbibliotek">DEFF, Denmark&#8217;s Electronic Research Library</a>, which funded the GSearch project.</p><p><h2>5. Journaling service</h2><br />Like a journaling file system or a journaling database, this capability allows one to capture all of the transactions applied to the repository and replay them against a secondary repository instance or to restore a repository from backup.</p><p><h2>6. Datastream checksums</h2><br />As part of its ingestion and maintenance functions, the FEDORA software can now calculate, store, and verify checksums of datastreams.  This helps ensure the integrity of the repository content, or at least detect when something goes wrong.</p><p><h2>7. 
Support for PostgreSQL</h2><br />In the battle over which relational database engine is best, FEDORA now supports most of the big ones out-of-the-box:  Oracle, MySQL, and now PostgreSQL.  Here at OhioLINK, we&#8217;ve started with MySQL but are considering a migration to PostgreSQL as our in-house, preferred RDBMS, so the timing of this announcement is great.</p><p><h2>8. Standard system logging with Log4J</h2><br />Put this one in the category of &#8220;playing nicely with others.&#8221;  We&#8217;ve already reaped the benefit of the refactored logging code in the client JAR file in a pre-release version of the code.</p><p><h2>9 and 10.  Bug fixes and many other enhancements</h2><br />The core code is evolving along a nice trajectory.  This is good to see for the health of the overall project!</p><p>Version 2.2 represents another monumental step towards the vision of a <b>F</b>lexible, <b>E</b>xtensible <b>D</b>igital <b>O</b>bject <b>R</b>epository <b>A</b>rchitecture.  Congratulations to the core developers for what sounds like it is going to be a great release.<p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://comm.nsdl.org/pipermail/fedora-users/2006-December/002330.html to http://article.gmane.org/gmane.comp.cms.fedora-commons.user/2330/ on January 19th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://defxws2006.cvt.dk/fedoragsearch/ on January 19th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/fedora-2-point-2/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Why FEDORA?  
Answers to the FEDORA Users Interview Survey</title><link>http://dltj.org/article/fedora-users-interview-survey/</link> <comments>http://dltj.org/article/fedora-users-interview-survey/#comments</comments> <pubDate>Fri, 15 Sep 2006 22:48:40 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[Library SOA]]></category> <category><![CDATA[Unified Content Repository]]></category> <category><![CDATA[library service-oriented architecture]]></category><guid isPermaLink="false">http://dltj.org/2006/09/fedora-users-interview-survey/</guid> <description><![CDATA[The Fedora Outreach and Communications team is conducting a survey of the high-level sense of passion and commitment inherent in the Fedora community. I&#8217;ve posted some answers back to the FEDORA wiki on behalf of OhioLINK, and am also including &#8230; <a href="http://dltj.org/article/fedora-users-interview-survey/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2006/09/fedora-users-interview-survey/"></abbr><p>The <span class="removed_link" title="http://www.fedora.info/wiki/index.php/Fedora_Outreach_User_Group">Fedora Outreach and Communications team</span> is <span class="removed_link" title="http://www.fedora.info/wiki/index.php/The_Fedora_Users_Interview_Survey">conducting a survey</span> of the high-level sense of passion and commitment inherent in the Fedora community.  I&#8217;ve posted some answers back to the FEDORA wiki on behalf of OhioLINK, and am also including the responses here as they fit into the &#8220;Why FEDORA?&#8221; series of blog postings.  (If you are reading this through an RSS news reader, I think you&#8217;ll have to actually come to the DLTJ website and scroll down to the bottom of this post to see the table of contents of the series.)  
On with the responses!</p><p><h2>How did you hear about Fedora?</h2></p><p>I first remember hearing about FEDORA at a Coalition for Networked Information meeting in 2003. I only really remember it in passing because what was being presented was so radical that I didn&#8217;t appreciate what was being described.</p><p>I next encountered FEDORA during a conference call with the Internet2 Shibboleth core developers in mid-2004. The topic was enabling cross-repository access management &#8212; a topic that is still a challenge today (although the Shibboleth team is working on it). By that time I had really started to catch on to what the FEDORA team was doing, and started paying closer attention.</p><p><h2>Why did you choose Fedora?</h2></p><p>When I arrived as project manager for the Ohio Digital Resource Commons (DRC) project in January 2005, OhioLINK was on the path to expand their existing Documentum installation to include a hosted institutional repository service. The Ohio DRC Steering Committee reviewed and accepted a proposal to use FEDORA as the foundation of this new hosted institutional repository service primarily because OhioLINK would be working with peers to develop the service (rather than working in isolation as likely would have happened with a Documentum-based solution).</p><p><h2>Were there economic advantages to your project/org. in selecting Fedora?</h2></p><p>The open source, free-to-license nature of FEDORA was definitely an advantage. It allowed us to turn grant funding that would have been used to pay for additional Documentum modules and licenses into salary for temporary-hire programmers. 
In that way we felt that we had better control over our destiny by creating the application code ourselves rather than relying on consultants.</p><p><h2>What is Fedora&#8217;s unique role in your production system?</h2></p><p>OhioLINK is beginning to look at the Service Oriented Architecture (SOA) software design paradigm, and FEDORA fits right into that model as the content repository for all of our digital objects. If anything, FEDORA&#8217;s nature as a best-of-breed content repository &#8212; and nothing else &#8212; encourages us to think along the lines of SOA sooner than we might otherwise have done.</p><p><h2>Is there one specific Fedora attribute that enables your project/organization to accomplish your overall goals?</h2></p><p>The fact that FEDORA is completely agnostic to what is contained in a datastream &#8212; be it audio, video, image, dataset, PDF, Dublin Core, MODS, EAD, FGDC, TEI, etc. &#8212; means that we can truly pursue a goal of managing all of our content in one place. The robustness of the content repository functions allows us to consider more interesting questions such as how this different content is ultimately presented to the end user.</p><p><h2>Do you see yourself as an active member of the Fedora community? Why?</h2></p><p>Yes. FEDORA represents the ability to take long-term control over the destiny of our digital objects. If, for some reason, the existing core developers at Cornell and UVa disappeared, a vibrant user community (OhioLINK included) could pick up the task of maintaining the software for the collective good. 
And if no one but OhioLINK is left in a &#8220;FEDORA community&#8221; our job of migrating out of it, should we desire to do so, is eased by the fact that we have the full view of the source code to help us move content and services to a new platform.</p><p><h2>What would inspire you to become more involved?</h2></p><p>It would take the existence of more hours in the day, I&#8217;m afraid!</p><p><h2>What should be the mission of an ongoing Fedora organization?</h2></p><p>A FEDORA community should first and foremost inspire communication among users of the FEDORA software. Almost all of us are working with extremely limited resources, and it weakens our collective effort if there is duplicated work underway. This communication should include not only developers but also users of the software.<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://www.fedora.info/wiki/index.php/The_Fedora_Users_Interview_Survey on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://www.fedora.info/wiki/index.php/Fedora_Outreach_User_Group on January 13th, 2011.</p><div class='series_links'><a href='http://dltj.org/article/description-datastream/' title='Best Practice Proposal for a DESCRIPTION Datastream'>Previous in series</a></div>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/fedora-users-interview-survey/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> <item><title>Best Practice Proposal for a DESCRIPTION Datastream</title><link>http://dltj.org/article/description-datastream/</link> <comments>http://dltj.org/article/description-datastream/#comments</comments> <pubDate>Wed, 06 Sep 2006 19:58:42 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[Dublin Core]]></category> 
<category><![CDATA[libraries]]></category> <category><![CDATA[metadata]]></category><guid isPermaLink="false">http://dltj.org/2006/09/description-datastream/</guid> <description><![CDATA[OhioLINK is deep in the process of migrating content from our old Bulldog/Documentum-based system to, well, something else, and we&#8217;ve been talking about the treatment of the metadata in the course of the migration. I think it is safe to &#8230; <a href="http://dltj.org/article/description-datastream/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2006/09/description-datastream/"></abbr><p>OhioLINK is deep in the process of migrating content from our old Bulldog/Documentum-based system to, well, something else, and we&#8217;ve been talking about the treatment of the metadata in the course of the migration.  I think it is safe to say that the Bulldog asset management system (and Documentum, which bought and integrated Bulldog into its product line about five years ago) is not really known for its rich handling of metadata.  Or at least how the library community thinks of metadata:  Dublin Core, MIX, MODS, MARC, VRA Core, PREMIS, FGDC, etc. &mdash; all at the same time in the same application engine with structured crosswalks between them. <sup><a href="http://dltj.org/article/description-datastream/#footnote_0_109" id="identifier_0_109" class="footnote-link footnote-identifier-link" title="Reality check for those in the &amp;#8220;library community&amp;#8221; &amp;#8230; do you think of metadata in this way?">1</a></sup> I think it is also safe to say that pure, unqualified Dublin Core, the only datastream that is required for every FEDORA object, does not completely encompass the descriptive fidelity needed for our objects.  
These observations, combined with reading <a href="http://www.hull.ac.uk/esig/repomman/documents/" title="RepoMMan Documents">a mid-term project report from the RepoMMan effort</a> in the U.K., got me thinking about metadata and how we should store it in FEDORA objects.  The outcome of that line of thinking is this proposal:  &#8220;to establish a practice of creating an in-line XML datastream with the label &#8216;DESCRIPTION&#8217; that contains the primary descriptive metadata for each object.&#8221;</p><p><h2>Rationale</h2><br />Although FEDORA mandates an unqualified Dublin Core datastream for every object, unqualified Dublin Core is not expressive enough to describe our objects.  Therefore I recommend establishing this practice so subsequent agents/consumers of the objects (internal disseminators and external applications) will know the location of the most expressive metadata for the object.</p><p><h2>Risks/Unknowns</h2></p><ul><li>FEDORA does not provide a mechanism to keep elements of the DESCRIPTION datastream in sync with the DC datastream.  Do we store common data elements (e.g. &#8220;creator&#8221;) in both places?  If so, our front-end applications would need to change the value of &#8220;creator&#8221; in two places and there is always the risk that they will get out of sync. How much real value is there in maintaining the FEDORA-mandated DC datastream?</li><li>There is no convention (that I know of) for a &#8220;primary descriptive metadata&#8221; datastream label in a FEDORA object, so &#8220;DESCRIPTION&#8221; is an arbitrary choice at this point.  
Future practices may go against this decision (although the choice does set us up to start using datastream labels like &#8220;PRESERVATION&#8221; for PREMIS metadata and so forth).</li></ul><p><h2>Background</h2></p><p>In their &#8220;Experiences with Fedora&#8221; report, the RepoMMan team noted:</p><blockquote><p>&#8230;working with Fedora&#8217;s compulsory Dublin Core (DC) datastream started one thinking about the metadata that a repository object would eventually need and how this might be mapped onto the Dublin Core fields.  It was some considerable time later than an e-mail on the Fedora-users list made it clear that the inherent DC datastream was intended solely for Fedora&#8217;s internal use and not as the basis of external searches. <sup><a href="http://dltj.org/article/description-datastream/#footnote_1_109" id="identifier_1_109" class="footnote-link footnote-identifier-link" title="Richard Green, &amp;#8220;Experiences with Fedora during the project&amp;#8217;s first year&amp;#8221; Report D-D8, July 2006; page 8; retrieved 28-Aug-2006 from http://www.hull.ac.uk/esig/repomman/downloads/D-D8-fedora-exp-v10.pdf.">2</a></sup></p></blockquote><p>Even with our simplest collection, we already know that unqualified Dublin Core will not be sufficient (most specifically, we had discussions about the lack of precision of &#8220;Date&#8221; and &#8220;Coverage&#8221; as compared to the field labels we already have in the Bulldog data dictionary).  It is important that our metadata be parsable by machine processes, so I would advocate the proposed practice rather than trying to &#8220;shoe-horn&#8221; our descriptions into unqualified Dublin Core with text labels added to the values and the like.  
And if we keep the metadata machine parsable, we will have a wider variety of options for indexing the data and displaying it at the presentation layer.</p><p>The &#8220;in-line XML&#8221; part of this proposal means that the DESCRIPTION datastream would be &#8220;managed&#8221; by the FEDORA server (i.e., not external or referenced), so it would become part of the object in the content store.</p><p><h2>Example</h2><br />If we take for a moment what is displayed in the presentation layer for <span class="removed_link" title="http://worlddmc.ohiolink.edu/Science/Details?oid=4005859">a sample object from the Forestry collection</span> as the sum total of all of the descriptive metadata for an object of this collection, a corresponding DESCRIPTION datastream would look something like:<br />[xml]<br /><metadata xmlns="http://drc.ohiolink.edu/schema/"<br /> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"<br /> xsi:schemaLocation="http://drc.ohiolink.edu/schema/<br />http://drc.ohiolink.edu/schema/schema.xsd"<br /> xmlns:dc="http://purl.org/dc/elements/1.1/"<br /> xmlns:dcterms="http://purl.org/dc/terms/"><br /> <dc:title>Catalpa speciosa, bignonoides and Kampfera seeds.</dc:title><br /> <dc:creator>Ohio Agricultural Experiment Station.  Dept. of<br />Forestry.</dc:creator><br /> <dc:description>Catalpa speciosa, bignonoides and Kampfera seeds.<br />Item #2</dc:description><br /> <dc:contributor>Ohio Agricultural Research and Development<br />Center</dc:contributor><br /> <dc:date>1908-12</dc:date><br /> <dcterms:available xsi:type="dcterms:W3CDTF"><br /> 2003-04-17T00:00:00<br /> </dcterms:available><br /> <dc:type>photographic prints</dc:type><br /> <dc:identifier>hdl:21151</dc:identifier><br /> <dc:source>2</dc:source><br /> <dmci:spatial>Ohio</dmci:spatial><br /> <dc:rights>Copyright: Ohio State University</dc:rights><br /> <dcterms:license xsi:type="dcterms:URI"><br />http://library.osu.edu/sites/dlib/terms.html<br /> </dcterms:license><br /></metadata><br />[/xml]<br /><h2>Comments?</h2><br />Reactions to the proposal?  
A rational step forward, or is there a better way?<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://worlddmc.ohiolink.edu/Science/Details?oid=4005859 on December 31st, 2010.</p><h2>Footnotes</h2><ol class="footnotes"><li id="footnote_0_109" class="footnote">Reality check for those in the &#8220;library community&#8221; &#8230; do you think of metadata in this way?</li><li id="footnote_1_109" class="footnote">Richard Green, &#8220;Experiences with Fedora during the project&#8217;s first year&#8221; Report D-D8, July 2006; page 8; retrieved 28-Aug-2006 from <a href="http://www.hull.ac.uk/esig/repomman/downloads/D-D8-fedora-exp-v10.pdf" title="Experiences with Fedora during the project&#039;s first year">http://www.hull.ac.uk/esig/repomman/downloads/D-D8-fedora-exp-v10.pdf</a>.</li></ol><div class='series_links'><a href='http://dltj.org/article/representing-collections-in-fedora/' title='Representing Collections In FEDORA'>Previous in series</a> <a href='http://dltj.org/article/fedora-users-interview-survey/' title='Why FEDORA?  
Answers to the FEDORA Users Interview Survey'>Next in series</a></div>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/description-datastream/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> <item><title>Analysis of CDL&#8217;s XTF textIndexer to Replace the Local Files with FEDORA Objects</title><link>http://dltj.org/article/xtf-fedora-2/</link> <comments>http://dltj.org/article/xtf-fedora-2/#comments</comments> <pubDate>Tue, 22 Aug 2006 20:57:57 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[California Digital Library]]></category> <category><![CDATA[digital libraries]]></category> <category><![CDATA[libraries]]></category> <category><![CDATA[xtf]]></category><guid isPermaLink="false">http://dltj.org/2006/08/xtf-fedora-2/</guid> <description><![CDATA[This is a continuation of the investigation about integrating the California Digital Library&#8217;s XTF software into the FEDORA digital object repository that started earlier. This analysis looks at the textIndexer module in particular, starting with an overview of how textIndexer &#8230; <a href="http://dltj.org/article/xtf-fedora-2/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2006/08/xtf-fedora-2/"></abbr><p>This is a continuation of the investigation about integrating the California Digital Library&#8217;s XTF software into the FEDORA digital object repository that <a href="http://dltj.org/2006/08/xtf-fedora-1">started earlier</a>.  
This analysis looks at the textIndexer module in particular, starting with an overview of how textIndexer works now with filesystem-based objects and ending with an outline of how this could work when reading objects from a FEDORA repository instead.</p><p><h2>XTF&#8217;s Native File System handler</h2></p><p>Natively, XTF wants to read content out of the file system.  The core of the processing is done in these two class files:</p><p><h3>TextIndexer.java</h3></p><p>The <code>main()</code> driver for ingesting content into the index.  It reads command-line arguments (<code>cfgInfo.readCmdLine( args, startArg );</code>) to determine the various parameters, one of which is the top of the document source directory (<code>String srcRootDir = Path.resolveRelOrAbs( xtfHomeFile, cfgInfo.indexInfo.sourcePath );</code>).  Assuming all goes well, it calls methods to open the Lucene index for writing, process files in the source directory, and close the Lucene index:<br />[java]<br />srcTreeProcessor.open( cfgInfo );<br />srcTreeProcessor.processDir( new File(srcRootDir), 0 );<br />srcTreeProcessor.close();<br />[/java]</p><p><h3>SrcTreeProcessor.java</h3></p><p><code>processDir()</code> is called recursively on the directory structure to process files in that directory.  For each directory, a <code>docBuf</code> XML-as-a-string buffer is built, consisting of an element for every directory entry. <code>docBuf</code> is fed into the SAXON processor along with the docSelector XSLT stylesheet.  The resulting XML is read node-by-node looking for file entries that have an &#8220;indexFile&#8221; tag.  For each matching node, it calls <code>processFile()</code> to index that entry.</p><p><code>processFile()</code> will run the prefilter XSLT against the file content, build the Lazy Tree (if possible and requested), create the <code>IndexSource</code> version by running the source document through the appropriate file type &#8220;*IndexSource&#8221; method (e.g. 
<code>PDFIndexSource()</code>, <code>XMLIndexSource()</code>, and <code>MARCIndexSource()</code>) and queue the content for indexing by the Lucene indexer.</p><p><h2>Requirements for an Object Handler for textIndexer</h2><br />Based on this analysis, if one were to replace the TextIndexer.java and SrcTreeProcessor.java &#8220;front end&#8221; of textIndexer, I think these would be the pieces that would be required.  (Note that some steps are skipped in this overview &#8212; any replacement of these two classes would need to be sure to do everything that those classes do now.)</p><ol type="1"><li>Parse command line and configuration file parameters to create an <span class="removed_link" title="http://texts-stage.cdlib.org/xtf/javadoc/org/cdlib/xtf/textIndexer/IndexerConfig.html">IndexerConfig</span> instance (guiding parameters for the indexer as a whole) and an <span class="removed_link" title="http://texts-stage.cdlib.org/xtf/javadoc/org/cdlib/xtf/textIndexer/IndexInfo.html">IndexInfo</span> instance (parameters specific to the identified index-name).</li><li>Specify a collection of objects that you want in index-name.</li><li>Open up a writable instance of the index-name&#8217;s Lucene index (<i>a la</i> <code>srcTreeProcessor.open( cfgInfo );</code>).</li><li>For each object to be put into index-name, do these things:<ol type="a"><li>Optionally, run the source object through a prefilter (an XSLT transformation used to restructure the source document just prior to indexing without changing the stored source document).</li><li>Optionally, remove a DOCTYPE declaration in the source object before it is indexed.</li><li>Set up a transformation object from the native file format to something that is XML and call <code>textProcessor.checkAndQueueText()</code> to add it to a queue to be processed.</li></ol></li><li>Close index-name&#8217;s Lucene index (<i>a la</i> <code>srcTreeProcessor.close();</code>), which should have the side effect of processing the queued text 
(<i>a la</i> <code>textProcessor.processQueuedTexts();</code>) which will ultimately create the Lazy Tree (if specified) and add the object to the Lucene index.</li><li>Optionally, compare the collection of objects that you want in index-name with what is actually in index-name before you started, and remove anything that wasn&#8217;t in the specified collection.</li></ol><p><h2>Considering a FEDORA-based XTF handler</h2><br />So, all-in-all, that doesn&#8217;t seem too bad.  Here is where we get to mix in some FEDORA pieces and see what we get in the end.</p><p>First off, in terms of dealing with &#8220;collections of source objects to be indexed&#8221; I think it would be best to have this start with one of our &#8220;collection aggregation&#8221; objects as the root level of a source collection.  We&#8217;d perform an RDF &#8220;isMemberOf&#8221; query against the resource index using the FEDORA PID of the aggregation object (and optionally make an &#8220;isMemberOf&#8221; query recursively against the returned set &#8212; as if one was drilling down a file system).</p><p>Secondly, to get the XML content to be indexed, each object would have a <code>getXML</code> disseminator (see <a href="http://dltj.org/2006/05/fedora-disseminators/">Thinking about Our FEDORA Disseminators</a> for background) that would render to XTF an XML version of itself.  If the source object is an XML-based object, it just returns the XML.  If the source object is a PDF or Word document or something that can be rendered into a text-like form, the disseminator would handle that.  If the source object is an image or audio clip, the disseminator can return the descriptive XML of the object.  
The point being, though, by the time the object gets to XTF&#8217;s textIndexer, it has already been rendered to XML, so just the XML transformation tool would be needed (as in this snippet from SrcTreeProcessor.java):<br />[java]<br />IndexSource srcFile = null;<br />if( format.equalsIgnoreCase("XML") ) {<br /> InputSource finalSrc = new InputSource( systemId );<br /> srcFile = new XMLIndexSource( finalSrc, srcPath, key,<br /> preFilters, displayStyle, lazyStore );<br /> if( removeDoctypeDecl )<br /> ((XMLIndexSource)srcFile).removeDoctypeDecl( true );<br />}<br />[/java]</p><p>Third, we would need a FEDORA-aware driver that replaces TextIndexer.java and SrcTreeProcessor.java.  Given a configuration file location and a starting PID, it would gather the objects to be indexed, &#8220;open&#8221; the Lucene index, run through the snippet of Java above for each object, and &#8220;close&#8221; the Lucene index.</p><p>The quick-and-dirty first implementation would copy the XML source to a directory on the hard drive (directory and subdirectory names would be the PID of the aggregation object containing the collection of objects), and have XTF use that local filesystem copy as the indexed source.  Lazy Tree files for each object would also be created and stored locally.  This means we have two copies (three, if you count the Lazy Tree) of the object lying around, so eventually I think we&#8217;d want to modify XTF to pull content directly from FEDORA using a REST-based URL.  Eventually I think we may also want to store the Lazy Tree in something other than the local file system.  
Could that be another datastream in the FEDORA object?</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://texts-stage.cdlib.org/xtf/javadoc/org/cdlib/xtf/textIndexer/IndexerConfig.html on December 31st, 2010.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://texts-stage.cdlib.org/xtf/javadoc/org/cdlib/xtf/textIndexer/IndexInfo.html on December 31st, 2010.</p><div class='series_links'><a href='http://dltj.org/article/xtf-fedora-1/' title='CDL&#8217;s XTF as a Front End to Fedora'>Previous in series</a> <a href='http://dltj.org/article/xtf-fedora-3/' title='XTF and FEDORA &mdash; Comments from the Community'>Next in series</a></div>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/xtf-fedora-2/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> <item><title>CDL&#8217;s XTF as a Front End to Fedora</title><link>http://dltj.org/article/xtf-fedora-1/</link> <comments>http://dltj.org/article/xtf-fedora-1/#comments</comments> <pubDate>Tue, 22 Aug 2006 13:29:09 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[California Digital Library]]></category> <category><![CDATA[digital libraries]]></category> <category><![CDATA[libraries]]></category> <category><![CDATA[OhioLINK]]></category> <category><![CDATA[xtf]]></category><guid isPermaLink="false">http://dltj.org/2006/08/xtf-fedora-1/</guid> <description><![CDATA[We&#8217;re experimenting pretty heavily now with the California Digital Library&#8217;s XTF framework as a front-end to a FEDORA object repository. 
Initial efforts look promising &#8212; thanks go out to Brian Tingle and Kirk Hastings of CDL; Jeff Cousens, Steve DiDomenico, &#8230; <a href="http://dltj.org/article/xtf-fedora-1/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2006/08/xtf-fedora-1/"></abbr><p>We&#8217;re experimenting pretty heavily now with the <a href="http://cdlib.org/" title="California Digital Library">California Digital Library</a>&#8217;s <a href="http://sourceforge.net/projects/xtf" title="SourceForge.net: eXtensible Text Framework (XTF)">XTF</a> framework as a front-end to a <a href="http://www.fedora.info/" title="Fedora">FEDORA object repository</a>.  Initial efforts look promising &#8212; thanks go out to Brian Tingle and Kirk Hastings of CDL; Jeff Cousens, Steve DiDomenico, and Bill Parod from Northwestern; and Ross Wayland from UVa for helping us along in the right direction.</p><p><h2>XTF into Eclipse How-To</h2><br />As we get more serious about XTF, I wrote up a <span class="removed_link" title="http://drc-dev.ohiolink.edu/wiki/EclipseXTFHowTo">How-To document for bringing XTF into Eclipse</span> so that it can be deployed as a dynamic web application.  Let me know if you find it useful.  Definitely let me know if you find it in error.  We haven&#8217;t put a version of XTF into OhioLINK&#8217;s source code repository, but that might follow shortly.</p><p><h2>Points of Integration</h2><br />In its base configuration, XTF reads documents out of a &#8220;data&#8221; directory that is in the application&#8217;s Tomcat context directory.  It looks like two of the XTF components will need to be modified to successfully converse with a FEDORA-based object repository:  DynaXML and textIndexer.  
Of the two, DynaXML seems to be the more straightforward.</p><p><h3>DynaXML</h3><br />First I went looking for where XTF&#8217;s DynaXML reads documents and found the <a href="http://xtf.hg.sourceforge.net/hgweb/xtf/xtf/file/549e4167039e/WEB-INF/src/org/cdlib/xtf/dynaXML/DocLocator.java" title="">DocLocator interface</a> with one <a href="http://xtf.hg.sourceforge.net/hgweb/xtf/xtf/file/de7d8a406bef/WEB-INF/src/org/cdlib/xtf/dynaXML/DefaultDocLocator.java" title="">implementation that looks into the file system</a>.  John Davison, one of the DRC programmers, figured out (with help from the CDL folks) that in fact it is possible to pass a FEDORA API-A URL to DefaultDocLocator and have it do the right thing.  Its <code>getInputSource()</code> method has this signature:</p><p>[java]<br />public InputSource getInputSource( String sourcePath,<br /> boolean removeDoctypeDecl ) throws IOException<br />[/java]<br />&#8230;followed shortly by:</p><p>[java]<br />// If it's non-local, load the URL.<br />if( sourcePath.startsWith("http:") ||<br /> sourcePath.startsWith("https:") )<br />{<br /> return new InputSource( sourcePath );<br />}<br />[/java]<br />where &#8220;InputSource&#8221; is the <a href="http://www.docjar.com/docs/api/org/xml/sax/InputSource.html" title="InputSource">entry point into the SAX parser</a>, which will accept a URI as a parameter.</p><p>Unfortunately, using DefaultDocLocator in this way negates the use of <span class="removed_link" title="http://xtf.sourceforge.net/WebDocs/HTML/XTF_Under_Hood/XTFUnderHood.html#LazyFiles">CDL&#8217;s &#8220;Lazy Trees&#8221;</span> (a binary version of each XML document containing all the original contents of the document, plus an index telling XTF where each element starts and ends).  
Lazy Trees are a good thing because they speed up parsing of the XML document and the resulting rendering to the user.</p><p>When dealing with local files (as opposed to the URL method described above), DefaultDocLocator will build a Lazy Tree in its index directory the first time the XML document is called up.  In implementing a FEDORA interface for XTF&#8217;s DynaXML, what is required is a mixture of the two: use a URL (or, in the case of FEDORA, a PID plus API-A call) to get the document, and then create and store its Lazy Tree in the XTF index directory for subsequent retrieval.  This does seem pretty straightforward, does it not?</p><p><h3>textIndexer</h3><br />XTF&#8217;s textIndexer, on the other hand, really wants the XML it is indexing to be files on the local hard drive.  The XTF programming guide speaks of a <span class="removed_link" title="http://xtf.sourceforge.net/WebDocs/HTML/XTF_Programming_Guide/XTFProgGuide.html#textIndexer_DocSelector_Prog">textIndexer Document Selector</span> whose job it is to create a single XML file with the specifications of which documents to index and how to do it:</p><blockquote><p>It is the responsibility of the <b>Document Selector</b> XSLT code to output an XML fragment that identifies which of the files in the directory should be indexed. This output XML fragment should take the following form:<br />[xml]<br /><indexFiles><br /> <indexFile fileName      = "FileName"<br /> {format       = "FileFormatID"}<br /> {preFilter    = "PreFilterPath"}<br /> {displayStyle = "DocumentFormatterPath"}><br /> </indexFile><br /></indexFiles><br />[/xml]</p></blockquote><p>Now the trick seems to be to build an alternate Document Selector that will not use filenames but rather URIs to build the index.  
That&#8217;ll be the subject of the next round of investigations.</p><p>Comments and observations are welcome!<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://drc-dev.ohiolink.edu/wiki/EclipseXTFHowTo on December 31st, 2010.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://xtf.sourceforge.net/WebDocs/HTML/XTF_Under_Hood/XTFUnderHood.html#LazyFiles on December 31st, 2010.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://xtf.sourceforge.net/WebDocs/HTML/XTF_Programming_Guide/XTFProgGuide.html#textIndexer_DocSelector_Prog on December 31st, 2010.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://xtf.cvs.sourceforge.net/xtf/xtf/WEB-INF/src/org/cdlib/xtf/dynaXML/DocLocator.java?revision=1.5&#038;view=markup to http://xtf.hg.sourceforge.net/hgweb/xtf/xtf/file/549e4167039e/WEB-INF/src/org/cdlib/xtf/dynaXML/DocLocator.java on January 28th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://xtf.cvs.sourceforge.net/xtf/xtf/WEB-INF/src/org/cdlib/xtf/dynaXML/DefaultDocLocator.java?revision=1.10&#038;view=markup to http://xtf.hg.sourceforge.net/hgweb/xtf/xtf/file/de7d8a406bef/WEB-INF/src/org/cdlib/xtf/dynaXML/DefaultDocLocator.java on January 28th, 2011.</p><div class='series_links'> <a href='http://dltj.org/article/xtf-fedora-2/' title='Analysis of CDL&#8217;s XTF textIndexer to Replace the Local Files with FEDORA Objects'>Next in series</a></div>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/xtf-fedora-1/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Representing Collections In FEDORA</title><link>http://dltj.org/article/representing-collections-in-fedora/</link> 
<comments>http://dltj.org/article/representing-collections-in-fedora/#comments</comments> <pubDate>Tue, 25 Jul 2006 19:28:44 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[DRC]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[digital libraries]]></category> <category><![CDATA[library service-oriented architecture]]></category> <category><![CDATA[metadata]]></category> <category><![CDATA[open source]]></category> <category><![CDATA[RDF]]></category> <category><![CDATA[standards]]></category><guid isPermaLink="false">http://dltj.org/2006/07/representing-collections-in-fedora/</guid> <description><![CDATA[One of the DRC developers had a question recently that sparked a discussion about what to do with collections of objects. In order to answer the question of how to represent the notion of a collection within the repository, we&#8217;re &#8230; <a href="http://dltj.org/article/representing-collections-in-fedora/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2006/07/representing-collections-in-fedora/"></abbr><p>One of the DRC developers had a question recently that sparked a discussion about what to do with collections of objects.  In order to answer the question of how to represent the notion of a collection within the repository, we&#8217;re going to have to get pretty heavy into RDF:  the Resource Description Framework.  
RDF is a language created by the World Wide Web Consortium &#8220;for representing information about resources in the World Wide Web.&#8221;  If you already know about RDF &#8212; or just want to see the proposed solution &#8212; you can skip down to the &#8220;<a href="#nid91L">RDF for Collections in FEDORA</a>&#8221; heading.</p><p>As a preface, I have to say that I&#8217;m increasingly uncomfortable with the word &#8220;collection&#8221; because it has become so overloaded in library usage, and, like Carl Lagoze, I prefer the term &#8220;aggregation&#8221; to describe in a general sense what we think a collection is and what it could become.  I probably bounce back and forth between the terms here, but am aiming to use &#8220;aggregation&#8221; and &#8220;aggregation object&#8221; more often.</p><p>I&#8217;m going to be pulling a lot of examples and language from the &#8220;<a href="http://www.w3.org/TR/rdf-primer/" title="RDF Primer">RDF Primer</a>&#8221;, which I would recommend reading at some point.  It is a very long, dense document, but if you can get through it you&#8217;ll have a very good understanding of what RDF is and what it does for us.</p><p>The Primer describes RDF this way:  &#8220;It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource&#8230;.  RDF is based on the idea that the things being described have properties which have values, and that resources can be described by making statements &#8230; that specify those properties and values.&#8221;</p><p>There are three parts to an RDF statement about an object.  &#8220;[T]he part that identifies the thing the statement is about (the Web page in this example) is called the <em>subject</em>. 
The part that identifies the property or characteristic of the subject that the statement specifies (creator, creation-date, or language in these examples) is called the <em>predicate</em>, and the part that identifies the value of that property is called the <em>object</em>.&#8221;</p><p>These components make up what is called an &#8220;RDF triple.&#8221;  When written in tabular form an RDF triple is conventionally written in the order subject, predicate, object.  To represent RDF statements in a machine-processable way, RDF uses the Extensible Markup Language [XML]. RDF defines a specific XML markup language, referred to as RDF/XML, for use in representing RDF information, and for exchanging it between machines.</p><p>For instance, imagine trying to state the (nominally, Dublin Core) descriptive metadata about a web page called http://www.example.org/index.html.  In natural language, the descriptive elements could be:</p><div style="padding: 10px; margin: 0.67em auto; border: thin solid silver; font-size: 85%; color: black; background: #FFC"> <strong>http://www.example.org/index.html</strong> has a <strong>creator</strong> whose value is <strong>John Smith</strong><br /> <strong>http://www.example.org/index.html</strong> has a <strong>creation-date</strong> whose value is <strong>Aug 16, 1999</strong><br /> <strong>http://www.example.org/index.html</strong> has a <strong>language</strong> whose value is <strong>English</strong></div><p>In tabular form, this could look like:</p><div style="padding: 10px; margin: 0.67em auto; border: thin solid silver; font-size: 85%; color: black; background: #FFC"><table><tr><th>Subject</th><th>Predicate</th><th>Object</th></tr><tr><td>http://www.example.org/index.html</td><td>creator</td><td>John Smith</td></tr><tr><td>http://www.example.org/index.html</td><td>creation-date</td><td>Aug 16, 1999</td></tr><tr><td>http://www.example.org/index.html</td><td>language</td><td>English</td></tr></table></div><p>In XML, this could look 
like:</p><div style="padding: 10px; margin: 0.67em auto; border: thin solid silver; font-size: 85%; color: black; background: #FFC"><pre>
&lt;?xml version="1.0"?&gt;
&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:dc="http://purl.org/dc/elements/1.1/"&gt;
  &lt;rdf:Description rdf:about="http://www.example.org/index.html"&gt;
    &lt;dc:creator&gt;John Smith&lt;/dc:creator&gt;
    &lt;dc:creation-date&gt;Aug 16, 1999&lt;/dc:creation-date&gt;
    &lt;dc:language&gt;English&lt;/dc:language&gt;
  &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;</pre></div><p>Keep in mind, though, that we expressed the predicate here as Dublin Core; the predicate can be anything &#8212; even something you make up!</p><p><h2>RDF for Collections in FEDORA</h2></p><p>RDF is used throughout FEDORA &#8212; in fact, the Dublin Core properties can be (and in our FEDORA configuration, are) expressed as RDF triples in an internal database and can be searched as such.  But RDF triples can be used to express more than just attributes about an object &#8212; they can be used to express <em>relationships</em> between objects.  There is a whole section of the FEDORA docs called &#8220;<a href="http://www.fedora.info/download/2.1.1/userdocs/digitalobjects/introRelsExt.html" title="Fedora Digital Object Relationships">Fedora Digital Object Relationships</a>&#8221; that goes into more detail.  Quotations and examples in this section are drawn from that document.</p><p>&#8220;Fedora digital objects can be related to other Fedora objects in many ways.  For example there may be a Fedora object that represents a collection and other objects that are members of that collection.  
Also, it may be the case that one object is considered a part of another object, a derivation of another object, a description of another object, or even equivalent to another object.&#8221;</p><p>FEDORA comes with a <a href="http://www.fedora.info/definitions/1/0/fedora-relsext-ontology.rdfs" title="http://www.fedora.info/definitions/1/0/fedora-relsext-ontology.rdfs">list of common relationships between objects</a>, and other community or user-defined relationships may also be asserted.  These relationships can be expressed in RDF notation:</p><div style="padding: 10px; margin: 0.67em auto; border: thin solid silver; font-size: 85%; color: black; background: #FFC"><table><tr><th>Subject</th><th>Predicate</th><th>Object</th></tr><tr><td>&lt;subjectFedoraObject&gt;</td><td>&lt;relationshipProperty&gt;</td><td>&lt;targetFedoraObject&gt;</td></tr><tr><td>MyCatVideo</td><td>is a member of collection</td><td>GreatCatVideos</td></tr><tr><td>drc:101</td><td>isMemberOf</td><td>drc:100</td></tr><tr><td>drc:101</td><td>isFromInstitution</td><td>mu3ug</td></tr></table></div><pre>
&lt;?xml version="1.0"?&gt;
&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
  xmlns:drcns="http://drc.ohiolink.edu/ontologies/relationships#"&gt;
  &lt;rdf:Description rdf:about="info:fedora/drc:101"&gt;
     &lt;fedora:isMemberOf rdf:resource="info:fedora/drc:100" /&gt;
     &lt;drcns:isFromInstitution&gt;mu3ug&lt;/drcns:isFromInstitution&gt;
  &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;</pre><p>&#8220;drc:100&#8221; is an aggregation object (otherwise known as a &#8220;collection object&#8221;, but I&#8217;ve learned from others in the FEDORA community that &#8220;collection&#8221; is too loaded a word) of which &#8220;drc:101&#8221; is a member.  
To put it in terms that we may be familiar with:</p><ul><li>drc:100 is the aggregation object for the &#8220;Charles E. Frohman Collection&#8221;</li><li>drc:101 is a digital image of a photograph with the title &#8220;Work Crew&#8221; that is part of the Charles E. Frohman collection</li><li>drc:101 is a digital image contributed by member institution &#8220;mu3ug&#8221;</li></ul><p><h2>Conclusions</h2><br />So the task becomes, I believe, to examine the <a href="http://www.fedora.info/definitions/1/0/fedora-relsext-ontology.rdfs" title="http://www.fedora.info/definitions/1/0/fedora-relsext-ontology.rdfs">pre-loaded set of relationships</a>, match those against the existing relationships in the DMC, and then define any unique relationships (such as &#8220;isFromInstitution&#8221;) that we would want to express about our objects.</p><div class='series_links'><a href='http://dltj.org/article/fedora-disseminators-for-accessibility/' title='Fedora Disseminators to Enable Accessible Repository Content'>Previous in series</a> <a href='http://dltj.org/article/description-datastream/' title='Best Practice Proposal for a DESCRIPTION Datastream'>Next in series</a></div>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/representing-collections-in-fedora/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> </channel> </rss>
