This article was imported from this blog's previous content management system (WordPress), and may have errors in formatting and functionality. If you find these errors are a significant barrier to understanding the article, please let me know.
We're now experimenting heavily with the California Digital Library's XTF framework as a front-end to a FEDORA object repository. Initial efforts look promising. Thanks go out to Brian Tingle and Kirk Hastings of CDL; Jeff Cousens, Steve DiDomenico, and Bill Parod from Northwestern; and Ross Wayland from UVa for helping us along in the right direction.
XTF into Eclipse How-To
As we get more serious about XTF, I wrote up a How-To document for bringing XTF into Eclipse so that it can be deployed as a dynamic web application. Let me know if you find it useful. Definitely let me know if you find it in error. We haven't put a version of XTF into OhioLINK's source code repository, but that might follow shortly.
Points of Integration
In its base configuration, XTF reads documents out of a "data" directory inside the application's Tomcat context directory. It looks like two of the XTF components will need to be modified to converse successfully with a FEDORA-based object repository: DynaXML and textIndexer. Of the two, DynaXML seems to be the more straightforward.
DynaXML
First I went looking for where XTF's DynaXML reads documents and found the DocLocator interface with one implementation that looks into the file system. John Davison, one of the DRC programmers, figured out (with help from the CDL folks) that in fact it is possible to pass a FEDORA API-A URL to DefaultDocLocator and have it do the right thing. Its 'getInputSource()' method has this signature:
Unfortunately, using DefaultDocLocator in this way negates the use of CDL's "Lazy Trees" (a binary version of each XML document containing all the original contents of the document, plus an index telling XTF where each element starts and ends). Lazy Trees are a good thing because they speed up parsing of the XML document and the resulting rendering to the user.
When dealing with local files (as opposed to the URL method described above), DefaultDocLocator builds a Lazy Tree in its index directory the first time the XML document is called up. A FEDORA interface for XTF's DynaXML therefore needs to combine the two approaches: fetch the document by URL (or, in the case of FEDORA, by PID plus an API-A call), then create and store its Lazy Tree in the XTF index directory for subsequent retrieval. This does seem pretty straightforward, does it not?
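To make the fetch-then-cache idea concrete, here is a minimal Java sketch. It is not XTF or FEDORA code: the class CachingFedoraLocator, its Fetcher interface, and the cache file naming are hypothetical names made up for illustration. It assumes an API-A-Lite style URL of the form {base}/get/{PID}/{datastream ID}, and it only stores a local copy of the XML the first time a PID is requested. Building the Lazy Tree itself is left to XTF, on the untested assumption that a file-based locator handed the local copy would build one as it does for any other local file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: fetch a FEDORA datastream once, then reuse the local copy. */
class CachingFedoraLocator {

    /** Opens a stream for a URI; in real use this might be uri.toURL().openStream(). */
    interface Fetcher {
        InputStream open(URI uri) throws IOException;
    }

    private final String fedoraBase; // assumed form: "http://host:8080/fedora"
    private final Path cacheDir;     // assumed: a directory XTF is allowed to read
    private final Fetcher fetcher;

    CachingFedoraLocator(String fedoraBase, Path cacheDir, Fetcher fetcher) {
        this.fedoraBase = fedoraBase;
        this.cacheDir = cacheDir;
        this.fetcher = fetcher;
    }

    /** Builds an API-A-Lite style datastream URL: {base}/get/{pid}/{dsId}. */
    URI datastreamUri(String pid, String dsId) {
        return URI.create(fedoraBase + "/get/" + pid + "/" + dsId);
    }

    /** Returns a local file for the datastream, fetching it only on first use. */
    Path locate(String pid, String dsId) throws IOException {
        Files.createDirectories(cacheDir);
        // PIDs contain ':' (e.g. "demo:5"), which is not a safe file name character.
        Path cached = cacheDir.resolve(pid.replace(':', '_') + "_" + dsId + ".xml");
        if (Files.notExists(cached)) {
            // Download to a temporary file first so that a failed transfer
            // never leaves a truncated document in the cache.
            Path tmp = Files.createTempFile(cacheDir, "fetch", ".tmp");
            try (InputStream in = fetcher.open(datastreamUri(pid, dsId))) {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            Files.move(tmp, cached, StandardCopyOption.REPLACE_EXISTING);
        }
        return cached;
    }
}
```

A caller would construct the locator with something like `uri -> uri.toURL().openStream()` as the fetcher and pass the returned path on to whatever reads local files. The sketch never refreshes a cached copy, so a real implementation would also need some way to notice when the object changes in the repository.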
textIndexer
XTF's textIndexer, on the other hand, really wants the XML it is indexing to be files on the local hard drive. The XTF programming guide speaks of a textIndexer Document Selector whose job it is to create a single XML file with the specifications of which documents to index and how to do it:
It is the responsibility of the Document Selector XSLT code to output an XML fragment that identifies which of the files in the directory should be indexed. This output XML fragment should take the following form:
Now the trick seems to be to build an alternate Document Selector that will not use filenames but rather URIs to build the index. That'll be the subject of the next round of investigations.
Comments and observations are welcome!
The text was modified to remove a link to http://drc-dev.ohiolink.edu/wiki/EclipseXTFHowTo on December 31st, 2010.
The text was modified to remove a link to http://xtf.sourceforge.net/WebDocs/HTML/XTF_Under_Hood/XTFUnderHood.html#LazyFiles on December 31st, 2010.
The text was modified to remove a link to http://xtf.sourceforge.net/WebDocs/HTML/XTF_Programming_Guide/XTFProgGuide.html#textIndexer_DocSelector_Prog on December 31st, 2010.
The text was modified to update a link from http://xtf.cvs.sourceforge.net/xtf/xtf/WEB-INF/src/org/cdlib/xtf/dynaXML/DocLocator.java?revision=1.5&view=markup to http://xtf.hg.sourceforge.net/hgweb/xtf/xtf/file/549e4167039e/WEB-INF/src/org/cdlib/xtf/dynaXML/DocLocator.java on January 28th, 2011.
The text was modified to update a link from http://xtf.cvs.sourceforge.net/xtf/xtf/WEB-INF/src/org/cdlib/xtf/dynaXML/DefaultDocLocator.java?revision=1.10&view=markup to http://xtf.hg.sourceforge.net/hgweb/xtf/xtf/file/de7d8a406bef/WEB-INF/src/org/cdlib/xtf/dynaXML/DefaultDocLocator.java on January 28th, 2011.