JPEG2000 to Zoomify Code4Lib Lightning Talk Video Now Available

Thanks, Noel, and everyone else who made the video editions of Code4Lib 2008 presentations possible. I just noticed that the video of my JPEG2000 to Zoomify Shim lightning talk is online.

There have been some updates since the post and the presentation were first done. The code in the source code repository has been refactored to use JJ2000 as part of the Sun ImageIO package. We were seeing thread-safety problems with Kakadu and thought that using the multithreaded ImageIO package would help. Unfortunately, even with extensive caching, it did not. My next task is to bring Kakadu back into the picture using the thread-safe JNI implementation that is part of the ImageIO-ext project to see if that helps.

Unfortunately, time ran out before this needed to go into initial production with the OhioLINK DRC roll-out, so it isn’t in production. The scheme shows promise, though, so I’m going to keep working with it…

JPEG2000 to Zoomify Shim — Creating JPEG tiles from JPEG2000 images

This is a textual representation of a lightning talk given on Feb 26th at Code4Lib 2008. When the video of the talk is up (thanks, Noel!) I’ll link it here, too. The video is now available, and that article includes an update on progress since this article was posted.

OhioLINK has a collection of JPEG2000 images that were generated as an access format for use in our DLXS-based content system. We are in the process of migrating those collections to DSpace and were looking for a mechanism to leverage the existing JPEG2000 files and not have to generate new derivatives. We are also considering the use of JPEG2000 as a preservation format, and would find it attractive to use the same image format for both access copies and preservation copies. We looked at Zoomify, but to perform its scaling function it generates JPEG tiles at several resolutions, and storing those tiles can triple or quadruple disk space requirements. Or, one could use the ‘enterprise’ version of Zoomify and its proprietary PFF format or the equally proprietary MrSID format. We didn’t want to be locked into either of these scenarios. Our solution is to create a web application that mimics the directory-of-JPEG-tiles solution, but dynamically generates the tiles out of a JPEG2000 master.

The free version of Zoomify reads JPEG tiles out of a directory structure that looks like this:

/ImageProperties.xml: Includes descriptive elements of the source image like height, width, and tile size.

/TileGroup0/0-0-0.jpg: The highest power-of-2 zoom-out level that creates an image with dimensions less than 256×256.

/TileGroup0/1-0-0.jpg: The tile at the upper left corner at the first power-of-2 zoom level.

/TileGroup0/1-1-0.jpg: The tile to the right of 1-0-0.jpg.
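As a rough illustration of how such a tile pyramid is laid out, the sketch below lists the zoom levels for an image of a given size. It assumes that each zoom-out step halves the dimensions (rounding up) and stops once the whole image fits in a single 256-pixel tile; the function name and the rounding rule are my own assumptions for illustration, not something taken from Zoomify's documentation.

```python
import math


def zoom_levels(width, height, tile_size=256):
    """Return [(level, level_width, level_height, columns, rows), ...].

    Level 0 is the most zoomed-out level (a single tile); the last entry
    is the full-resolution image. Halving with round-up is an assumption.
    """
    sizes = [(width, height)]
    # Keep halving until the whole image fits inside one tile.
    while sizes[-1][0] > tile_size or sizes[-1][1] > tile_size:
        w, h = sizes[-1]
        sizes.append((math.ceil(w / 2), math.ceil(h / 2)))
    sizes.reverse()  # smallest level first, so it becomes level 0
    return [
        (level, w, h, math.ceil(w / tile_size), math.ceil(h / tile_size))
        for level, (w, h) in enumerate(sizes)
    ]
```

Under these assumptions a 1000×800 image has three levels holding 1, 4, and 16 tiles respectively.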

The shim mimics that directory structure. It parses the URL of the request and dynamically creates the appropriate JPEG tile (or metadata file) out of the JPEG2000 image.
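The URL-parsing step can be sketched roughly as follows. This is not the shim's actual code (which is Java); it is a minimal Python sketch that assumes the tile file name encodes zoom level, column, and row in that order, and that each level halves the resolution of the one above it. The names `parse_tile_request`, `source_region`, and `max_level` are hypothetical.

```python
import re

# Matches the tail of a request such as ".../TileGroup0/1-1-0.jpg".
TILE_PATH = re.compile(r"/TileGroup(\d+)/(\d+)-(\d+)-(\d+)\.jpg$")


def parse_tile_request(path):
    """Return (group, level, column, row), or None if this is not a tile URL."""
    match = TILE_PATH.search(path)
    if match is None:
        return None
    group, level, column, row = (int(part) for part in match.groups())
    return group, level, column, row


def source_region(level, column, row, max_level, width, height, tile_size=256):
    """Return (left, top, right, bottom) in full-resolution pixel coordinates.

    max_level is the level number of the full-resolution image; each level
    below it is assumed to cover twice as many source pixels per tile pixel.
    """
    span = tile_size * 2 ** (max_level - level)  # source pixels covered per tile edge
    left = column * span
    top = row * span
    # Tiles on the right and bottom edges are clipped to the image bounds.
    right = min(left + span, width)
    bottom = min(top + span, height)
    return left, top, right, bottom
```

A decoder would then be asked for just that region of the JPEG2000 image, scaled down to the tile size, and the result encoded as a JPEG.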

The Code

The JPEG2000 for Zoomify shim requires Java 1.5 or greater. It does not require a servlet engine; rather, it uses the Restlet library to perform as a stand-alone application. The OneJar library allows the Java classes and required dependencies to be bundled into a single JAR file. We’re using the Kakadu Software JPEG2000 library to perform the on-the-fly decoding of JPEG2000 images. Kakadu is a commercial JPEG2000 codec, although inexpensive licenses are available for not-for-profit activity. We are using the Enterprise version of Zoomify, a Flash-based image viewer, although I believe the free version will work as well. (You’ll need the Enterprise version to be able to modify and adapt the appearance of the Zoomify applet.) The same techniques can also be used for other Flash applets and probably even JavaScript-based viewers (a la Google Maps).

The source code is available from the OhioLINK DRC source code repository (Subversion access). We plan to integrate it into DSpace 1.5 as part of the Ohio Digital Resource Commons, and I may create a Fedora disseminator to serve up the tiles as well.

Thanks go out to Keith Gilbertson and John Davison on the OhioLINK staff for their help in making this work as well as Stu Hicks and François d’Erneville for being a sounding board for these ideas.

Voting open for Code4Lib 2009; Central Ohio is a candidate

The Columbus Metropolitan Library, OCLC, Ohio State University, and OhioLINK have put in a bid as host site for the 2009 Code4Lib meeting. Code4Lib is an informal organization of self-selected librarians and technology professionals. It exists as a volunteer organization run by consensus of interested individuals. The meeting in 2009 will be the fourth [1] face-to-face meeting of this group. Details of the central Ohio host location proposal are on the web at

Information about becoming a member of the Code4Lib community and voting in the host site selection process is included below.

The meeting is conducted in an “unconference” or “barCamp” format. It is a highly democratic style consisting of prepared talks, “lightning talks” (described below) and breakouts; the meeting schedule is divided almost equally between these three components. Prepared talks are 20 minutes long and are proposed by speakers prior to the meeting. Proposals are voted on by the entire Code4Lib community, and the highest ranking ones are slotted into the schedule. “Lightning talks” are 5 minutes long and are assigned on a first-come, first-scheduled basis at the start of the meeting. Prepared talks and lightning talks are presented to the entire attendee body (i.e., it is a single-track meeting); they are also usually recorded and published to the web after the meeting. Time slots for breakouts are built into the schedule and rooms are provided by the conference organizers. Attendees create breakout sessions at the meeting on any topic on a first-come, first-scheduled basis.

If you have any questions about Code4Lib in general or the central Ohio site proposal in particular, please let me know.

Code4Lib Host Site Voting Process

Adapted from a message by Mike Giarlo.

We received four very good proposals for hosting the 2009 conference, and now it is time to vote on them! Voting is open until 3am Eastern Time on Thursday, February 28th. We expect to announce results at the conference later that day.

How to vote:

  1. Go here:
  2. Log in using your credentials (register at if you haven’t done so already)
  3. Click on a host’s name to read the proposal in full
  4. Assign the proposal a rank from 0 to 3, 0 being least desirable and 3 being the most.
  5. Once you are satisfied with your rankings, click “Cast your ballot”

Feel free to watch for returns.

And as always, if you have questions or other feedback, let us know.


P.S. Your vote counts! Please keep the conference requirements and desirables in mind as you make your selection:

P.P.S. The election is not powered by Diebold.


  1. Thanks for the correction, Mike!

Google Custom Search’s Planet Code4Lib as an OpenSearch Plugin

Earlier I mentioned creating a Google Custom Search for Planet Code4Lib. The Google-supplied markup puts a form on your web page that leads to Google’s server farm. (Alternatively, you can create a custom URL that points to an HTML page at Google which contains the form.) Well, that’s really neat, but it doesn’t go far enough. How about an OpenSearch plugin suitable for Firefox and MSIE7? Here is the plugin markup:

<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/" xmlns:moz="http://www.mozilla.org/2006/browser/search/">
   <ShortName>Planet Code4Lib</ShortName>
   <Description>Search the bloggers of Planet Code4Lib using Google Custom Search.</Description>
   <Tags>code4lib library</Tags>
   <Url type="text/html" template="{searchTerms}&amp;cx=017716194421589436379:zdoxzpetaxk&amp;sa=Search&amp;cof=FORID:0"/>
   <Image height="16" width="16" type="image/png"></Image>
   <moz:SearchForm></moz:SearchForm>
</OpenSearchDescription>

Pretty neat, eh? This link will install the search definition in Firefox and MSIE7.
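For anyone curious about what the browser does with that markup: when a search is run, the typed query is URL-encoded and substituted for the {searchTerms} placeholder in the Url template. A minimal Python sketch of that substitution follows; the function name and the example.org template in the usage note are placeholders of mine, not the actual Google Custom Search URL.

```python
from urllib.parse import quote_plus


def expand_template(template, search_terms):
    """Fill an OpenSearch-style URL template with a URL-encoded query."""
    return template.replace("{searchTerms}", quote_plus(search_terms))
```

For example, `expand_template("https://search.example.org/find?q={searchTerms}&cx=abc", "jpeg2000 tiles")` gives `https://search.example.org/find?q=jpeg2000+tiles&cx=abc`.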

Is this going too far?

One can’t help but wonder whether this violates the Google Custom Search Terms of Service. Here is part of section 1.1, Description of Service:

For purposes of the Terms of Use, “Site” shall mean the Web site or sites on which You place JavaScript or similar programming (“Code”) which renders the Google search box (or other means used by users of the Site (“End Users”) to enter a search query (“Query”)) on the Site (“Search Box”). All Queries sent from the Site to Google shall comply with the technical specifications that Google may provide from time to time, and must originate from the Site.

So I’m not really using JavaScript, but I am using XML markup. Can “Site” mean the user’s web browser interface? Further on in the ToS:

1.3 Your Obligations.
You shall receive a Query from the End User and shall forward that Query
to Google. You may not in any way frame or cache the Results produced
by Google, except as otherwise agreed to between You and Google. Google
will not be responsible for receiving Queries from End Users or for
transmission of data between You and Google’s network interface. You
shall be responsible for providing all hardware and software required
to perform Your obligations under the Terms of Use, including but not
limited to the following: (a) implementing and maintaining the Site,
(b) implementing and maintaining the interface between the Site and
the Service, and (c) receiving a Query from an End User and transmitting
the Query to Google.

So with the search plugin, I’m not receiving the query — rather I’m facilitating the process of forwarding the query from the user’s browser to Google. So far, so good, I think. The search plugin doesn’t frame or cache the results; I’m okay with that clause. With regard to my obligations, I’ll maintain DLTJ as the source of the OpenSearch XML configuration file (unless someone wants to put it directly on somewhere), but again DLTJ is not sitting between the end user and Google so I don’t think points (b) and (c) apply.

Too much legalese. In the spirit of mashups everywhere, I’ll put this out and ask for forgiveness if it violates Google’s sensibilities rather than asking for permission first.

Google Custom Search for Planet Code4Lib

I wanted to mess around with Google’s new Custom Search Engine feature and in casting about for a list of URLs to feed it I thought I’d try the list of blogs at Planet Code4Lib. As it turns out, this might be a modestly useful search if you remember reading something from one of the code4lib bloggers but can’t remember which one. The exercise was pretty fun and here is the result:

To build it, I started with the Planet Code4Lib OPML feed and ran some regular expression transformations against it, replacing these matches with empty strings (I used BBEdit on the Mac for this one-off, but it could probably be automated with a Perl script to a certain degree):


After a minimal amount of manual cleanup, I ended up with this list:

…and fed that into the Google Custom Search control panel.
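For the automation idea, here is a minimal sketch of one way to do it. Note that it swaps the regular-expression approach for an XML parser, and it assumes the OPML feed stores each blog's home page in an `htmlUrl` attribute on its `outline` elements (the usual convention for feed lists, but I have not checked it against the Planet Code4Lib feed). The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def blog_urls_from_opml(opml_text):
    """Return the distinct htmlUrl values from an OPML document, in order."""
    root = ET.fromstring(opml_text)
    urls = []
    for outline in root.iter("outline"):
        url = outline.get("htmlUrl")  # assumed attribute name; skipped if absent
        if url and url not in urls:
            urls.append(url)
    return urls
```

The resulting list would still need the same kind of manual cleanup before being pasted into the control panel.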

Items of note in the Terms of Service

Along the way I found some curious bits in the Google Custom Search Terms of Service. In particular:

1.5 Exclusivity. You agree that, during the Term, Google will be the exclusive provider of Internet search services on the Site. You further understand that Google will provide the Service on a nonexclusive basis, and that Google will continue to customize and provide its services to other parties for use in connection with a variety of applications, including search engine applications.

Section 1.1 defines Site this way:

For purposes of the Terms of Use, “Site” shall mean the Web site or sites on which You place JavaScript or similar programming (“Code“) which renders the Google search box (or other means used by users of the Site (“End Users“) to enter a search query (“Query“)) on the Site (“Search Box“).

One suspects what Google meant was that if you put up a Custom Search Box on your Site, then you must also use Google for any general internet search you might have; you can’t have a Google Custom Search Box and a Yahoo search box on the same Site, for instance. I imagine that this also effectively locks out other internet search engine providers from offering the same service. Since Google is first to market, if Yahoo were to come up with a similar service you couldn’t put up a Google Custom Search and a Yahoo custom search, each pointing to its provider’s index with the same subset of URLs. Since we know that each index contains different stuff and ranks results with different algorithms, one might imagine that the same custom search segments over a multiplicity of indexes could be a useful thing.

Ah, well — it is still useful. Just go in with your eyes open…
