Separating Configuration from Code in CollectionSpace


For the past few months I've been working on a project to migrate a museum's collection registry to CollectionSpace. CollectionSpace is a "free, open-source, web-based software application for the description, management, and dissemination of museum collections information." ((From the answer to the first question of the CollectionSpace frequently asked questions.)) CollectionSpace is multitenant software -- one installation of the software can serve many tenants. The software package's structure, though, means that the configuration for one tenant is mixed in with the code for all tenants on the server (e.g., the application layer, services layer, and user interface layer configuration are stored deep in the source code tree). This bothers me from a maintainability standpoint. Sure, Git's richly featured merge functionality helps, but intertwining configuration and code in this way seems unnecessarily complex. So I developed a structure that puts a tenant's configuration in a separate source code repository, along with a series of procedures that bring the two together at application-build time.

CollectionSpace Tenant Configuration Structure
There are three main parts to a CollectionSpace installation: the application layer, the services layer, and the user interface layer. Each of these has configuration information that is specific to a tenant. The idea is to move the configuration from these three layers into one place, then use Ansible to enforce the placement of references from the tenant's configuration directory to the three locations in the code. That way the configuration can be changed independently of the code.

The configuration consists of a file and three directories. Putting the reference to the file -- application-tenant.xml -- into the proper place in the source code directory structure is straightforward: we use a file system hard link. By their nature, though, hard links cannot point to directories, so we cannot use one to put a reference to a directory in another place in the file system. We could use a soft (symbolic) link, but those were problematic in my specific case because I was using 'unison' to synchronize the contents of the tenant configuration between my local filesystem and a Vagrant virtual system. (Unison had this annoying tendency to delete the directory and recreate it in some synchronization circumstances.) So I resorted to a bind mount to make the configuration directories appear inside the code directories.

To make sure this setup is consistent, I use Ansible to describe the exact state of references. Each time the Ansible playbook runs, it ensures that everything is set the way it needs to be before the application is rebuilt. That Ansible script looks like this:

Some annotations:

  • Lines 12-18 create the hard link for the tenant application XML file.
  • Handling the tenant configuration directories takes three steps. Using the application configuration as an example, lines 20-24 first make sure that a directory exists where we want to put the configuration into the code directory.
  • Next, lines 26-34 use mount --bind to make the application configuration appear to be inside the code directory.
  • Lastly, lines 35-41 ensure the bind mount persists through system rebuilds (although line 33 makes sure the bind mount is working each time the playbook is run).
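
For readers who want something concrete to start from, here is a minimal sketch of what tasks along these lines might look like. It is not the playbook described above, so the line numbers in the annotations do not apply to it. The host group, variable names, and paths are all hypothetical placeholders, and the sketch assumes the ansible.posix collection is installed. It also folds the "mount now" and "remember the mount" steps into a single mount task, which is a simplification rather than the three-step approach annotated above.

```yaml
# Sketch only: the host group, variables, and paths below are hypothetical.
- hosts: collectionspace            # hypothetical inventory group
  become: true                      # bind mounts and /etc/fstab edits need root
  vars:
    tenant_config_dir: /opt/tenant-config                          # hypothetical checkout of the tenant repository
    app_tenant_xml_dest: /opt/cspace-src/application-tenant.xml    # hypothetical; the real target sits deep in the source tree
    app_config_mount_point: /opt/cspace-src/tenant-app-config      # hypothetical directory inside the code tree
  tasks:
    # A hard link only works when source and destination are on the same file system.
    - name: Hard-link the tenant application XML file into the source tree
      ansible.builtin.file:
        src: "{{ tenant_config_dir }}/application-tenant.xml"
        dest: "{{ app_tenant_xml_dest }}"
        state: hard

    - name: Make sure the mount point exists inside the code directory
      ansible.builtin.file:
        path: "{{ app_config_mount_point }}"
        state: directory

    # state: mounted both mounts the directory now and records it in /etc/fstab,
    # so the bind mount is re-established after a reboot.
    - name: Bind-mount the application configuration into the code directory
      ansible.posix.mount:
        src: "{{ tenant_config_dir }}/application"
        path: "{{ app_config_mount_point }}"
        fstype: none
        opts: bind
        state: mounted
```

The same directory and mount tasks would be repeated (or looped) for the services and user interface configuration directories.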

Then the typical CollectionSpace application build process runs.

  • Lines 89-120 stop the Tomcat container and rebuild the application, services, and user interface parts of the system.
  • Lines 122-133 start Tomcat and wait until it is responding.
  • Lines 135-163 log in to CollectionSpace, get the session cookie, then initialize the user interface and the vocabularies/authorities.
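
The build sequence above could be sketched roughly as follows. Again, this is an illustration under stated assumptions rather than the actual playbook: the service name, build command, base URL, login path, form field names, credentials, and initialization paths are all hypothetical placeholders that would need to be replaced with the values for a real installation. The `cookies_string` return value of the uri module requires a reasonably recent Ansible release.

```yaml
# Sketch only: every name, URL, path, and command below is a hypothetical placeholder.
- hosts: collectionspace
  become: true
  vars:
    cspace_source_dir: /opt/cspace-src                      # hypothetical
    cspace_base_url: http://localhost:8080/collectionspace  # hypothetical
    build_command: ./rebuild.sh                             # hypothetical stand-in for the real build steps
    cspace_user: admin@example.org                          # hypothetical
    cspace_password: changeme                               # hypothetical
    init_paths:                                             # hypothetical initialization endpoints
      - init-ui
      - init-vocabularies
  tasks:
    - name: Stop Tomcat before rebuilding
      ansible.builtin.service:
        name: tomcat                                        # hypothetical service name
        state: stopped

    - name: Rebuild the application, services, and user interface layers
      ansible.builtin.command:
        cmd: "{{ build_command }}"
        chdir: "{{ cspace_source_dir }}/{{ item }}"
      loop:
        - application
        - services
        - ui

    - name: Start Tomcat
      ansible.builtin.service:
        name: tomcat
        state: started

    # Retry until the server answers with HTTP 200 (up to 30 tries, 10 seconds apart).
    - name: Wait until Tomcat is responding
      ansible.builtin.uri:
        url: "{{ cspace_base_url }}"
      register: tomcat_check
      until: tomcat_check.status | default(0) == 200
      retries: 30
      delay: 10

    - name: Log in and capture the session cookie
      ansible.builtin.uri:
        url: "{{ cspace_base_url }}/login"                  # hypothetical login path
        method: POST
        body_format: form-urlencoded
        body:
          userid: "{{ cspace_user }}"                       # hypothetical form field names
          password: "{{ cspace_password }}"
        status_code: [200, 302, 303]
      register: login

    - name: Initialize the user interface and the vocabularies/authorities
      ansible.builtin.uri:
        url: "{{ cspace_base_url }}/{{ item }}"
        headers:
          Cookie: "{{ login.cookies_string }}"
      loop: "{{ init_paths }}"
```

Polling with `until`, `retries`, and `delay` keeps the playbook from racing ahead to the login step while Tomcat is still deploying the rebuilt application.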

I run this playbook almost every time I make a change to the CollectionSpace configuration. (The exception is for simple changes to the user interface; sometimes I'll just log in to the server and rebuild the user interface manually.) If you want to see what the directory structure looks like in practice, the configuration directory is on GitHub.