Open Repositories Presentation: Building an IR Interface Using EJB3 and JBoss Seam

Below is the outline of the Ohio DRC presentation from today’s FEDORA session at the Open Repositories conference. Comments welcome!

Building an Institutional Repository Interface Using EJB3 and JBoss Seam

This tour is designed to show the overall architecture of a FEDORA digital object repository application within the JBoss Seam framework while at the same time pointing out individual design decisions and extension points that are specific to the Ohio Digital Resource Commons application. It is geared towards software developers; familiarity with Java Servlet programming is assumed, although not strictly required. Knowledge of JBoss Seam, Hibernate/Java Persistence API, EJB3 and Java EE would be helpful but is not required; brief explanations of core concepts of these technologies are included in this tour.

The tour is based on revision 709 of /drc/trunk and was last updated on 18-Jan-2007.

This tour will also be incorporated into a presentation at Open Repositories 2007 on Tuesday afternoon.

Directory Layout

The source directory tree has four major components: ‘lib’, ‘resources’, ‘src’, and ‘view’.

lib – libraries required by the application. The lib directory contains all of the JAR libraries required by the application. Its contents are a mix of the Seam-generated skeleton (pretty much everything at the top level of the ‘lib’ directory) and JAR libraries that are specific to the DRC application (in subdirectories of ‘lib’ named for the library in use). For instance, the ‘commons-codec-1.3’, ‘hibernate-all’, and ‘jboss-seam’ JAR files were all brought into the project via ‘seam-gen’, while the ‘lib/commons-net-1.4.1/commons-net-1.4.1.jar’ library was added specifically for this project. A convention has been established whereby new libraries added to the project appear as entries in the file which is used by a series of directives in the build.xml file to set up the classpaths for compiling and for building the EJB JAR. This is done to make the transition of new libraries into the application more explicit and easily testable. Note that the newly included library directory also includes a copy of any license file associated with that library; this is not only a requirement to use some libraries but is also a good practice to show the lineage of some of the lesser known libraries. (For an example of what is required, see the changes to build.xml and to in order to bring the Apache Commons Net library into the application.)

resources – configuration files and miscellaneous stuff. The resources directory holds the various configuration files required by the application plus other files used for testing and demonstration. Much of this was generated by the Seam-generated skeleton as well. Some key files here are the import.sql file (SQL statements that are used to preload the RDBMS used by Hibernate as the mocked up repository system) and the test-datastreams directory which has sample files for each of the media types.

src – Java source code. The src directory contains all of the Java source code for the application. Everything exists in a package called ‘edu.ohiolink.drc’ with subpackages for classes handling actions from the view component of the MVC, entity beans (sometimes known as Data Access Objects — or DAOs — I think), exception classes (more on this below), classes for working with FEDORA (not currently used), media type handler classes (more on this below), unit test classes (not currently used), and utility classes.

view – XHTML templates, CSS files, and other web interface needs. The view directory holds all of the files for the “view” aspect of the Model-View-Controller paradigm. More information about the view components is below.

Entity Classes

The entity beans package has three primary entity beans defined: Item, Datastream, and Description. (One further entity bean in the package is not used at this time.) Item is the primary bean that represents an object in the repository. Datastream and Description are component beans that only exist in the lifecycle of an Item bean; Datastream holds a representation of a FEDORA object datastream and Description holds a representation of a Dublin Core datastream for that object.

The Datastream and Description objects are annotated with @Embedded in the source; this is Hibernate’s way of saying that these objects do not stand on their own. Item also has numerous methods marked with a @javax.persistence.Transient annotation, meaning that this is information not stored in the backing Hibernate database; these methods are for the various content handlers, which will be outlined below.

Mock Repository

As currently configured, the entity beans pull their information from a static RDBMS using Hibernate rather than from an underlying FEDORA digital object repository. (You’ll need to go back to revision 691 to see how far we got with the FEDORA integration into JBoss Seam before we switched our development focus to the presentation ‘view’ aspects of the application.) In this configuration, Hibernate uses an embedded Hypersonic SQL database for its datastore. As part of the application deploy process, the Java EE container will instantiate a Hypersonic database and preload it with the contents of the import.sql file. (The import.sql file contains just three sample records at the moment: one each for a text file, a PDF file, and a graphic file.)

All of the data for a repository object is contained in a single table record. Hibernate manages the process of reading that record out of the database and creating the three corresponding Java objects: Item, Datastream and Description. (Hibernate could also handle the process of updating the underlying table record if we were to change a value in one of the Java objects.) The mapping of table column to Java object field is handled by the @Column(name="xx") annotations in the entity beans.
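To make the idea of annotation-driven mapping concrete, here is a toy sketch of how one flat table record can populate an Item and its two embedded objects. It is not the DRC code and it does not use Hibernate: the `Col` and `Embed` annotations are locally defined stand-ins for the real javax.persistence annotations, the column names and fields are hypothetical, and `RowMapper` is a hypothetical helper that does by reflection a tiny fraction of what Hibernate does for us.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;

// Stand-ins for illustration only; these are NOT the javax.persistence annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Col { String name(); }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Embed { }

class Datastream {
    @Col(name = "content_location") String contentLocation; // hypothetical column name
}

class Description {
    @Col(name = "dc_title") String title; // hypothetical column name
}

class Item {
    @Col(name = "item_id") String id; // hypothetical column name
    @Embed Datastream datastream = new Datastream();
    @Embed Description description = new Description();
}

class RowMapper {
    // Copies values from one flat row into the target and into its embedded objects.
    static void populate(Object target, Map<String, String> row) {
        try {
            for (Field field : target.getClass().getDeclaredFields()) {
                field.setAccessible(true);
                Col col = field.getAnnotation(Col.class);
                if (col != null) {
                    field.set(target, row.get(col.name()));
                } else if (field.isAnnotationPresent(Embed.class)) {
                    populate(field.get(target), row); // the same row feeds the embedded object
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Could not map row", e);
        }
    }
}

class RowMapperDemo {
    public static void main(String[] args) {
        Item item = new Item();
        RowMapper.populate(item, Map.of(
                "item_id", "drc:1",
                "content_location", "test-datastreams/sample.txt",
                "dc_title", "Sample text"));
        System.out.println(item.id + " -> " + item.datastream.contentLocation);
    }
}
```

The point of the sketch is only that a single record fans out into three objects, with the annotations carrying the column names; the real entity beans leave all of this work to Hibernate.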

For Datastream, what is stored in the database is not the datastream content itself but rather a filename that points to the location of the datastream file. The file path in this field can either be absolute (meaning a complete path starting from the root directory of the filesystem) or a relative path. In the case of the latter, the path is relative to the deployed application’s WAR directory (something like “…/jboss-4.0.5.GA/server/default/deploy/drc.ear/drc.war/” for instance). Note that the getter/setter methods for the contentLocation are private — the rest of the application does not need to know the location of the datastreams; this will also be true when the DRC application is connected to a FEDORA digital object repository. The method marked public instead is getContent, and the implementation of getContent hides the complexity of the fact that the datastream is coming from a disk file rather than a FEDORA repository call. For the three records/repository-objects currently defined in ‘import.sql’ there are three corresponding demo datastreams in the test-datastreams directory.
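A minimal sketch of that arrangement follows. It assumes that getContent returns a byte array and that the WAR directory is handed to the object through its constructor; neither detail is stated above, and the real class is a Hibernate-managed bean rather than a plain object like this one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Datastream {
    private final Path warDirectory; // assumption: the base directory is supplied by the caller
    private String contentLocation;  // the value stored in the database record

    Datastream(Path warDirectory, String contentLocation) {
        this.warDirectory = warDirectory;
        this.contentLocation = contentLocation;
    }

    // Private on purpose: the rest of the application never sees the location.
    private String getContentLocation() { return contentLocation; }
    private void setContentLocation(String contentLocation) { this.contentLocation = contentLocation; }

    // The only public entry point; callers cannot tell a disk file from a repository call.
    public byte[] getContent() {
        Path location = Paths.get(getContentLocation());
        // Absolute paths are used as given; relative ones are resolved against the WAR directory.
        Path resolved = location.isAbsolute() ? location : warDirectory.resolve(location);
        try {
            return Files.readAllBytes(resolved);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read datastream", e);
        }
    }
}

class DatastreamDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("drc-war");
        Files.write(dir.resolve("sample.txt"), "hello".getBytes(StandardCharsets.UTF_8));
        Datastream relative = new Datastream(dir, "sample.txt");
        System.out.println(new String(relative.getContent(), StandardCharsets.UTF_8));
    }
}
```

Because only getContent is public, swapping the body of that method for a FEDORA call later should not ripple out to the callers.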

In all likelihood, this representation of the FEDORA repository will be too simple for us to move forward much further. In particular, the current notion of one datastream per repository object is too simplistic. The Datastream embedded object will likely need to be broken out into a separate table with a corresponding distinct Java object. (We may reach the same point soon for the Description object as well.)

By using the Entity Beans as a buffer between the business logic and the view components of the rest of the application, I hope we can minimize/localize the changes required in the future in order to replace the mock repository with a real underlying FEDORA repository.

View Templates

The preferred view technology for JBoss Seam is Facelets, a view handler for JavaServer Faces (JSF) that does not require the use of JavaServer Pages (JSP). Although the ‘.xhtml’ pages in the view directory bear a passing resemblance to JSP, behind the scenes they are radically different. Of note for us is the clean templating system used to generate pages. The home.xhtml file has a reference to the template.xhtml file in the ‘layout’ directory. If you read through the template.xhtml file, you can see where the Facelets engine will pull in other .xhtml files in addition to the content within the <ui:define name="body"> tag of home.xhtml.

Content Handlers

The paradigm of handling different media types within the DRC application is guided in large part by the notion of disseminators for FEDORA objects and the Digital Library Federation Aquifer Asset Actions experiments. The underlying concept is to push the media-specific content handling into the digital object repository and to have the presentation interface consume those content handlers as it is preparing the end-user presentation.

For instance, the DRC will need to handle content models for PDFs, images, video, and so forth. Furthermore, how a video datastream from the Digital Video Collection is offered to the user may be different from how a video datastream from a thesis is offered to the user. Rather than embedding the complexity of making those interface decisions into the front-end DRC application, this model of content handlers pushes that complexity closer to the objects themselves by encoding those behaviors as disseminators of the object. What the presentation layer gets from the object is a chunk of XHTML that it inserts into the dynamically generated HTML page at the right place.

There is work beginning on a framework for FEDORA disseminators at /BaseDisseminator/trunk in the source code repository; that work has been put on hold at the moment in favor of focusing on the presentation interface. In order to prepare for the time when the presentation behaviors are encoded as FEDORA object disseminators, the current presentation layer makes use of Content Handlers for each of the media types. The Handler interface defines the methods required by each handler and the TextHandler class, the ImageHandler class, and the PdfHandler class implement the methods for the three media types already defined.

Of these, the TextHandler class is the most complete, so I’ll use it as an example.

  • The getRawDatastream method takes the datastream and sends it back to the browser with the HTTP headers that cause a File-Save dialog box to open.
  • The getFullDisplay method returns a chunk of XHTML that presents the full metadata in a manner that can be included in a full metadata display screen.
  • The getRecordDisplay method (currently unwritten) returns a chunk of XHTML used to represent the object in a list of records that resulted from a user’s search or browse request.
  • The getThumbnail method (currently unwritten) returns a static graphic thumbnail rendition of the datastream (e.g. a cover page, a key video frame, etc.).

By making these content handlers distinct classes, it is anticipated that the rendering code for each of these methods can be more easily moved to FEDORA object disseminators with minimal impact to the surrounding DRC interface application.
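As a rough sketch of the shape this takes, the code below shows a Handler interface, a TextHandler, and a lookup by media type. The method names come from the list above, but the parameters and return types are assumptions, `HandlerRegistry` is a hypothetical helper, and throwing NoHandlerException from the lookup is a guess at how that exception is used. The getRawDatastream and getThumbnail methods are left out because they depend on the servlet API.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed signatures; the text names the methods but not their parameters.
interface Handler {
    String getFullDisplay(String title, String body);
    String getRecordDisplay(String title);
}

class TextHandler implements Handler {
    // Returns a chunk of XHTML for the full display screen.
    public String getFullDisplay(String title, String body) {
        return "<div class=\"full\"><h2>" + escape(title) + "</h2><pre>" + escape(body) + "</pre></div>";
    }

    // Returns a chunk of XHTML for one entry in a result list.
    public String getRecordDisplay(String title) {
        return "<li class=\"record\">" + escape(title) + "</li>";
    }

    // Minimal escaping so that the text cannot break the surrounding markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

// Simplified here; see the exception handling section for the real base class.
class NoHandlerException extends RuntimeException {
    NoHandlerException(String mediaType) {
        super("No content handler for " + mediaType);
    }
}

// Hypothetical helper that picks a handler by media type.
class HandlerRegistry {
    private final Map<String, Handler> handlers = new HashMap<>();

    void register(String mediaType, Handler handler) {
        handlers.put(mediaType, handler);
    }

    Handler forMediaType(String mediaType) {
        Handler handler = handlers.get(mediaType);
        if (handler == null) {
            throw new NoHandlerException(mediaType);
        }
        return handler;
    }
}

class HandlerDemo {
    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        registry.register("text/plain", new TextHandler());
        System.out.println(registry.forMediaType("text/plain").getRecordDisplay("Sample text"));
    }
}
```

The view layer only ever asks the registry for a Handler and inserts whatever XHTML comes back, which is the seam along which the rendering could later move into a FEDORA disseminator.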

Exception Handling

The DRC application follows the practice suggested by Barry Ruzek in Effective Java Exceptions (found via this link on The Server Side). The article can be summarized as:

One type of exception is a contingency, which means that a process was executed that cannot succeed because of a known problem (the example he uses is that of a checking account, where the account has insufficient funds, or a check has a stop payment issued.) These problems should be handled by way of a distinct mechanism, and the code should expect to manage them.

The other type of exception is a fault, such as the IOException. A fault is typically not something that is or should be expected, and therefore handling faults should probably not be part of a normal process.

With these two classes of exception in mind, it’s easy to see what should be checked and should be unchecked: the contingencies should be checked (and descend from Exception) and the faults should be unchecked (and descend from Error).

All unchecked exceptions generated by the application are subclasses of DrcBaseAppException. (DrcBaseAppException itself is a subclass of RuntimeException.) For an example, see NoHandlerException. By setting up all of the application’s exceptions to derive from this point, we have one place where logging of troubleshooting information can take place (although this part of the application has not been set up yet). Except when there is good reason to do otherwise, this pattern should be maintained.

At this point, no checked (or contingency) exceptions specific to the DRC have been defined. When they are needed, though, they will follow the same basic structure with a base exception derived from Exception.
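The two families might be laid out roughly as follows. DrcBaseAppException and NoHandlerException are named above, but their constructors here are assumptions, as is the use of java.util.logging for the central logging point. `DrcBaseContingencyException`, `ItemNotFoundException`, and `ExceptionDemo` are hypothetical names, since no contingency exceptions exist in the DRC yet.

```java
import java.util.logging.Logger;

// Fault base class: unchecked, so callers are not forced to catch it.
class DrcBaseAppException extends RuntimeException {
    private static final Logger LOG = Logger.getLogger("drc.faults"); // hypothetical logger name

    DrcBaseAppException(String message) {
        super(message);
        LOG.warning(message); // the single place where troubleshooting information is logged
    }
}

class NoHandlerException extends DrcBaseAppException {
    NoHandlerException(String mediaType) {
        super("No content handler for " + mediaType);
    }
}

// Hypothetical contingency base class: checked, so the compiler makes callers deal with it.
class DrcBaseContingencyException extends Exception {
    DrcBaseContingencyException(String message) {
        super(message);
    }
}

// Hypothetical contingency: asking for an item that does not exist is an expected outcome.
class ItemNotFoundException extends DrcBaseContingencyException {
    ItemNotFoundException(String id) {
        super("No item with id " + id);
    }
}

class ExceptionDemo {
    static String findTitle(String id) throws ItemNotFoundException {
        if (!"drc:1".equals(id)) {
            throw new ItemNotFoundException(id);
        }
        return "Sample text";
    }

    // The contingency is handled as part of the normal flow; faults are left to propagate.
    static String titleOrPlaceholder(String id) {
        try {
            return findTitle(id);
        } catch (ItemNotFoundException e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        System.out.println(titleOrPlaceholder("drc:1"));
        System.out.println(titleOrPlaceholder("drc:9"));
    }
}
```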


Picking a Java Web Application Framework

We’re beginning a new phase of our digital library development at OhioLINK and an oversimplification of one of the consequences of this new phase is that we will be developing more software from scratch rather than adapting stuff that we find out there on the net. (Another consequence of this new phase is our interest in applying the Service-Oriented Architecture paradigm to library applications.) In previous phases, we were somewhat at the mercy of whatever development framework was used in the application we were adopting. As we start this new development where we control more of our own destiny, we wanted to take a step back and look at the available frameworks to support our development efforts. The options we identified at the start were plain Java servlets, Apache Struts, Spring Framework, and EJB3 with JBoss SEAM.

Our analysis is admittedly by proxy rather than through direct experience. If we had more time, we would start our new phase in each one of these (with the associated learning curve for some of them) and pick the one that worked out the best. We have neither the luxury of time nor of excess developer talent, so we looked at what others were saying, created some sample applications, and discussed our gut reactions to each option. In the end, we decided to start with EJB3/SEAM.

Summary of Options

Over the years, the Java community has developed frameworks to support the creation of applications in the Java programming language. These frameworks consist of a combination of code libraries, programming rules, and best practices that have evolved based on the research and experience of the developer community. Just as almost no one designs, codes, debugs, and documents their own low-level file I/O libraries any more, these frameworks provide a level of abstraction over the raw programming language that enables developers to create better code faster.

Within the web application arena, arguably the earliest successful framework was Apache Struts. It was one of the first frameworks to promote a Model-View-Controller (MVC) architecture: the Model represents the business or database code, the View represents the page design code, and the Controller represents the navigational code. By creating these three layers, Struts promoted a practice of design and development away from commingling database code, page design code, and control flow code in the same JavaServer Pages files. In practice it was found that unless these three concerns are separated, larger applications become difficult to maintain.

The latest revolution in design architectures is Dependency Injection (DI). Sometimes also referred to as Inversion of Control (or IoC), it is a programming design pattern and architectural model in which the responsibility for object creation and object linking is removed from the code of the objects themselves. As the term “Dependency Injection” may imply, it is the framework that instantiates the component objects and binds them to each other through the use of setters and/or constructors rather than the components linking themselves together. In this pattern, the objects themselves become loosely coupled, allowing for a more dynamic and testable integration of the component objects. The creation and binding of objects is defined in an XML configuration file (Spring Framework) or via Java Annotations (EJB3).
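Stripped of any framework, the pattern looks something like the sketch below. All of the names (`Repository`, `MockRepository`, `ItemPage`, `Assembler`) are hypothetical, and the hand-written `Assembler` stands in for the job that Spring's XML file or EJB3's annotations would tell the container to do.

```java
// The component depends only on this interface, never on a concrete class.
interface Repository {
    String titleOf(String id);
}

class MockRepository implements Repository {
    public String titleOf(String id) {
        return "Mock title for " + id;
    }
}

class ItemPage {
    private final Repository repository;

    // Constructor injection: the dependency is handed in rather than created here.
    ItemPage(Repository repository) {
        this.repository = repository;
    }

    String render(String id) {
        return "<h1>" + repository.titleOf(id) + "</h1>";
    }
}

// Plays the role of the framework: it alone knows which implementations are wired together.
class Assembler {
    static ItemPage itemPage() {
        return new ItemPage(new MockRepository());
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        System.out.println(Assembler.itemPage().render("drc:1"));
    }
}
```

Because ItemPage never names MockRepository, a test can pass in any other Repository (even a lambda), and a real implementation can replace the mock by changing only the wiring.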

Two primary frameworks have emerged for the Dependency Injection paradigm: Spring Framework and EJB3. The Spring Framework is the grandparent of subsequent IoC efforts (including EJB3). A de facto standard, it is widely used in the Java programming community with a large number of tools and documentation. EJB3 is a formal specification adopted by participants in the Java Community Process for standardizing Java-related developments. Coupled with the JBoss SEAM framework, the EJB3/SEAM combination is roughly on par with the raw technical capabilities of the Spring Framework. SEAM removes some of the complexity of raw EJB3 by providing and preconfiguring popular choices for views (JSF) and models (Hibernate) as well as integrated AJAX-based Remoting and jBPM. Both promote the same MVC architecture as Struts.

The distinguishing characteristics are that Spring acts as an open integration tool at the expense of a complicated XML-based configuration process whereas EJB3/SEAM is simpler in its configuration and more restrictive in what other tools can be easily brought into the framework. In other words, Spring has a wide array of choices for the various framework components and for the most part the developer must manage the integration; EJB3/SEAM lessens the complexity of the framework by limiting the choices available for the various components to only the most popular options. Note that there is overlap between Spring and EJB3; the Pitchfork project allows for JSR-250 (Common Annotations) dependency injection and EJB 3.0 style interception as an add-on to the Spring framework. One should also note that a full EJB3 implementation is a kind of superset to SEAM, and if the needs of our applications go beyond that of SEAM it is possible to reduce our dependency on the SEAM framework by beginning to integrate other choices (and managing that integration ourselves).

OhioLINK staff would have a learning curve associated with both Spring and EJB3/SEAM (and Apache Struts as well, although it has been discounted as an option as a mostly dead-end architecture at this point). On the question of whether the learning curve is greater than the effort of doing things the “plain Java” way, one can point to the large number of developers who have made the leap to these frameworks and are arguably more productive for doing so. Since we are beginning the development of a new code base, it seems to be the ideal time to start up that learning curve with the creation and deployment of a new service. The learning curve might be easier for Spring than for EJB3/SEAM due to the wide variety of materials already produced for Spring, although the corporate backers of EJB3/SEAM seem to be filling this gap at a steady pace. The learning curve for EJB3/SEAM may be shallower, though, because SEAM simplifies the configuration of the framework itself.

At our development team meeting yesterday, we decided that OhioLINK will adopt a Dependency Injection (DI) paradigm over use of Apache Struts and plain Java servlets because of the anticipated large productivity gains after the learning curve. Of the two DI/IoC frameworks widely available now, we decided to adopt EJB3 coupled with JBoss SEAM. The standards-driven nature of EJB3 reduces the risk of adopting a technology that may not be supported in the medium-term. It is expected that JBoss SEAM will flatten the learning curve to make EJB3 more readily understood in the short term while providing a migration path to a broader EJB3 implementation (beyond the capabilities of SEAM) if needed in the future. The Spring Framework, to its credit, also makes it possible to envelop EJB3 objects as part of a Spring-based application, which could smooth the transition to that framework should it become necessary. We also understand that there is some risk of “vendor lock-in” by relying on JBoss SEAM; in our limited experience, applications seemed to deploy better under JBoss Application Server as opposed to a stock Apache Tomcat 5.5 installation. On the other hand (if the “standards” nature of EJB3 is to be trusted), it should be possible to move our code to other application servers, leaving SEAM behind, with minimal changes. We may also be able to further flatten the learning curve by buying a support contract with JBoss in the short- to medium-term.

Details about the Candidates

Enterprise Java Beans 3 (EJB3)

The EJB 3.0 framework is a standard framework defined by the Java Community Process (JCP) and supported by all major J2EE vendors. [Like the Spring framework, its architecture is based upon the Dependency Injection (DI) design pattern.] Open source and commercial implementations of EJB 3.0 specifications are already available from JBoss and Oracle. EJB 3.0 makes heavy use of Java annotations.

  • Based on standards work (the Java Community Process)
  • “Configuration by default” using Java Annotations backed by more complicated XML configuration files, if needed
  • A more compact, rigidly-defined framework stack; easier configuration, but fewer choices
  • Brand new (recently ratified); fewer tools, books, tutorials available


JBoss Seam

JBoss Seam is a powerful new application framework to build next generation Web 2.0 applications by unifying and integrating popular service oriented architecture (SOA) technologies like AJAX, Java Server Faces (JSF), Enterprise Java Beans (EJB3), Java Portlets and Business Process Management (BPM) and workflow.

Seam has been designed from the ground up to eliminate complexity at the architecture and the API level. It enables developers to assemble complex web applications with simple annotated Plain Old Java Objects (POJOs), componentized UI widgets and very little XML. The simplicity of Seam 1.0 will enable easy integration with the JBoss Enterprise Service Bus (ESB) and Java Business Integration (JBI) in the future.

  • Based on EJB3
  • Integrates together some of the most widely adopted components such as JavaServer Faces and Hibernate
  • Can reportedly be used outside of JBoss Application Server (direct experience shows some problems with this)

Spring Framework

The Spring framework is a popular but non-standard open source framework. It is primarily developed by and controlled by Interface21 Inc. The architecture of the Spring framework is based upon the Dependency Injection (DI) design pattern. Spring can work standalone or with existing application servers and makes heavy use of XML configuration files.

  • Explicit configuration via XML file
  • More flexible than EJB3/SEAM in swapping in and out various modules; yet the decisions made early on limit the ability to swap in and out later.
  • Several years old; widely available tools, books, tutorials
  • De facto standard of an open source community with benevolent control by a corporate entity.

Apache Struts

Apache Struts is a free open-source framework for creating Java web applications.

  • (Summary opinion) Generally seen as a deprecated framework; few new projects start with Struts.
  • Weak integration of new techniques such as AJAX.

Plain Java Servlet (No Framework)

  • Must build the entire application support layer (object data storage, presentation interfaces, helper functions, etc.) from scratch.
  • No ready-made integration of new techniques such as AJAX.
