<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"><channel><title>Disruptive Library Technology Jester &#187; DSpace</title> <atom:link href="http://dltj.org/tag/dspace/feed/" rel="self" type="application/rss+xml" /><link>http://dltj.org</link> <description>We&#039;re Disrupted, We&#039;re Librarians, and We&#039;re Not Going to Take It Anymore</description> <lastBuildDate>Mon, 06 Feb 2012 20:04:22 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <cloud domain='dltj.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' /> <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/3.0/us/</creativeCommons:license> <item><title>Open Repositories 2011 Report: Day 3 &#8211; Clifford Lynch Keynote on Open Questions for Repositories, Description of DSpace 1.8 Release Plans, and Overview of DSpace Curation Services</title><link>http://dltj.org/article/or11-report-4/</link> <comments>http://dltj.org/article/or11-report-4/#comments</comments> <pubDate>Sat, 11 Jun 2011 04:16:28 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Meeting]]></category> <category><![CDATA[Clifford Lynch]]></category> <category><![CDATA[DSpace]]></category> <category><![CDATA[higher education]]></category> <category><![CDATA[Open Repositories 2011]]></category><guid isPermaLink="false">http://dltj.org/?p=3014</guid> <description><![CDATA[The main Open Repositories conference concluded this morning with a keynote by Clifford Lynch and the separate user group meetings began. 
I tried to transcribe Cliff&#8217;s great address as best I could from my notes; hopefully I&#8217;m not misrepresenting what &#8230; <a href="http://dltj.org/article/or11-report-4/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=3014"></abbr><p>The main <a href="https://conferences.tdl.org/or/index.php/OR2011/OR2011main">Open Repositories conference</a> concluded this morning with a keynote by <a href="http://www.cni.org/staff/clifford_index.html" title="CNI Staff: Clifford Lynch">Clifford Lynch</a> and the separate user group meetings began.  I tried to transcribe Cliff&#8217;s great address as best I could from my notes; hopefully I&#8217;m not misrepresenting what he said in any significant ways.  He has some thought-provoking comments about the positioning of repositories in institutions and the policy questions that come from that.  For an even more abbreviated summary, check out this <a href="http://storify.com/datag/clifford-lynch-keynote-at-open-repositories-2011/" title="Clifford Lynch Keynote at Open Repositories 2011 - storify.com">Storify archive of tweets</a> during his keynote.  Then I attended the DSpace track of user group programming, and below there are summaries of plans for DSpace version 1.8 and the new DSpace Curation Services.</p><p><h2>Repositories: Major Progress and Open Questions</h2><br />Mindful that we are roughly a decade into building institutional repositories, Cliff said it was an appropriate time to look at what has been accomplished along with some of the open issues and new questions that have emerged.  We still don&#8217;t have a good way to measure content in repositories.  
People had radically different ideas from institution to institution on what an &#8220;object&#8221; is, so those metrics don&#8217;t mean much; counting terabytes is equally fruitless because some repositories have video and others only have textual material.</p><p>Instead, the growth of repositories has highlighted critical policy discussions of what the missions of institutions of higher education are supposed to be.  These include questions such as their responsibility to curate the knowledge they create, to curate the evidence on which inquiry is based, and to disseminate knowledge.  These weren&#8217;t on the table 10-15 years ago. Now they are central issues for discussion in the leadership of universities.  Tied to institutional repositories is the question of open access.  The scope of issues goes beyond just open access, though.  It reaches into the kinds of questions that are now getting traction, like access to research data and an institution&#8217;s role in the stewardship and dissemination of research data, as well as the creation of open educational resources and an institution&#8217;s responsibility to disseminate such resources.  These questions wouldn&#8217;t have emerged without the effort to build out institutional repositories.  As soon as you start talking about these questions, they demand investments in the infrastructure of institutional repositories.  So we should take satisfaction in the role that the efforts of deploying institutional repositories have played in advancing these critical policy discussions.  These questions have gone unanswered far too long.</p><p>That said, there is a danger of confusing mechanism with policy.  We went through a bad period around 2004 when institutional repositories were deployed without knowing what would be put in them.  Institutional repositories are services to support policies, not ends in themselves.  
We have mostly gotten past this and can have the discussion of whether institutionally based repositories are the appropriate tool and when we should build discipline-specific repositories or other kinds of platforms that are not institutionally focused.</p><p>He also noted that the question of institutional assets and the balance of faculty control with institutional responsibility is being talked about, if only quietly.  The piece of video that Clifford referred to &#8212; a talk by Derek Law on how universities have failed in their stewardship responsibilities of research &#8212; may be this video from the Blue Ribbon Task Force on Sustainable Digital Preservation and Access&#8217;s <a href="http://brtf.vidizmo.com/MashupPlayBack.aspx?type=d&amp;id=HI3pHUmaP7o%3d" title="Conversations about Research Data and Scholarly Discourse video archive, National Conversation on the Economic Sustainability of Digital Information">National Conversation on the Economic Sustainability of Digital Information</a> (skip to &#8220;chapter 2&#8221; of the video) held April 1, 2010 in Washington DC.</p><p>Not only have institutional repositories acted as a focal point for policy, they have also been a focal point for collaborations.  Library and IT collaborations were happening long before institutional repositories surfaced.  Institutional repositories, though, have been a great place to bring other people into that conversation, including faculty leaders to start engaging them in questions about dissemination of their work.  The same goes for chief research officers; in 1995, if you were a university librarian doing leadership work constructing digital resources to change scholarly communication, you would have talked to the CIO but might not have known who your chief research officer was.  
That set of conversations, which are now critical when talking about data curation, got its start with institutional repositories and related policies.</p><p>Another place for conversation has been with those in university administrations concerned with building public support for the institution by giving the public a deeper understanding of what the institution contributes to culture, industry, health and science, and by connecting faculty to this effort.  This goes beyond the press release by opening a public window into the work of the institutions.  This is particularly important today with questions of public support for institutions.</p><p>That said, there are a number of open questions and places where we are dealing with works-in-progress.  Cliff then went into an incomplete and, from his perspective, perhaps idiosyncratic, list of these issues.</p><p>Repositories are one of the threads that are leading us nationally and internationally into a complete rethinking of the practice of name authority.  While it is an old-fashioned librarian concept, it is converging with &#8220;identity management&#8221; from IT.  He offered an abbreviated and exaggerated example:  librarians did name authority for authors of stuff in general in the 19th century. In the 20th century there was too much stuff; the stuff in journals and magazines in particular became overwhelming.  So libraries backed off and focused only on books and stuff that went into catalogs; the rest they turned over to indexing and abstracting services.  We made a few weird choices, such as deciding that an authority file should be as simple as possible to disambiguate authors rather than as full as possible, so things like the national dictionaries of literary biographies developed alongside name authority files.</p><p>For scientific journal literature, publishers followed practices about how obscure author names could be (e.g. just last name and first initial). 
Huge amounts of ambiguity from &#8220;telegraphic author names&#8221; result in a horribly dirty corpus of data.  A variety of folks are realizing that we need to disambiguate authorship by assigning author identifiers and somehow go back and clean up the mess in the existing bibliographic data of scholarly literature, especially journal literature.  Institutions are taking more responsibility for the work of their community and are having to do local name authority all over again. We have the challenge of how to reconnect this activity to national and international files.  We also have a set of challenges on whether we want to connect this to biographical resources.  It brings up issues of privacy, when do people do things of record, and how much else should come along with building a public biography resource.  We also see a vast parallel investment in institutional identity management.  Institutions haven&#8217;t quite figured out that people don&#8217;t necessarily publish with the same name that is recorded in the enrollment or employment systems that the institution manages, and that it would be a good idea to tie those literary names to identity files that the institution manages.</p><p>We&#8217;re not confident of the ecological positioning of institutional repositories among a pretty complicated array of information systems found at a typical large university.  Those systems include digital library platforms, course management systems, lecture capture systems, facilities for archiving the digital records of the institution, and platforms intended to directly support active research by faculty.  All are evolving at their own rate.  It is unclear where the institutional repositories fit, and what the boundaries around them are.</p><p>Here is one example.  What is the difference between an institutional repository and a digital library/collection?  You&#8217;d get very different answers from different people.  
One answer might be who does the curation, how it is sourced, and how it is scoped.  The decisions are largely intellectual.  Making this confusing is that you&#8217;ll see the same platform used for institutional repositories and digital libraries.  We are seeing a convergence of the underpinning platforms.</p><p>Another one: learning management systems (LMS).  These have become virtually universal among institutions in the same timeframe that institutional repositories have been deployed.  We&#8217;ve done a terrible job at thinking about what happens to the stuff in them when the course is over.  We can&#8217;t decide if it is scholarly material, institutional records, or something else.  They are tangled up between learning materials and all of the stuff that populates a specific performance of a course such as quizzes and answers, discussion lists, and student course projects.  We don&#8217;t have taxonomies and policies here or a working distinction between institutional repositories and learning management systems.  It is an unusual institution that has a systematic export from the LMS to an IR.</p><p>Lecture capture systems are becoming quite commonplace; students are demanding them in much the same way that the LMS was demanded.  A lecture capture system may be more universally helpful than an LMS.  Lectures are being captured for a wide range of reasons, but not knowing why means it is difficult to know whether to keep them and how to integrate them into the institution&#8217;s resources.</p><p>Another example: the extent to which institutional repositories should sit in the stream of active work.  As faculty are building datasets and doing computation with them, when is it time for something to go into an institutional repository?  How volatile can content be in the repository?  How should repositories be connected or considered as robust working storage?  
He suspects that many institutional repositories are not provisioned with high-performance storage and network connections, and would become a bottleneck in the research process.  The answers would be different for big data sets and small data sets, and we are starting to see datasets that are too big to back up or too big to replicate.</p><p>Another issue is that of virtual organizations, the kind of collaborative efforts that span institutions and nations.  They often allow relatively low overhead to mobilize researchers to work on a problem, and are becoming commonplace in sciences and social sciences and starting to pop up in the humanities.  We have a problem with the rules of the road between virtual organizations and institution-based repositories.  It is easy to spin up an institutional repository for a virtual organization, but what happens to it when the virtual organization shuts down?  Some of these organizations are intentionally transient; how do we assign responsibility for a world of virtual organizations and map them into institutional organizations for long-term stewardship?</p><p>Software is starting to concern people.  So much scholarship is tied up now in complicated software systems that we are starting to see a number of phenomena.  One is data that is difficult to reuse or understand without the software.  Another is the difficulty surrounding reproducibility &#8212; taking results and realizing they are dependent on an enormous stack of software, and we don&#8217;t have a clear way to talk about the provenance of a result that is based on the stack of software versions that would allow for high confidence in reproduction of results. We&#8217;re going to have to deal with software.  
We are also entering an era of deliberate obsolescence of software; for instance, any Apple product that is older than a few years is going to the dustbin and it hasn&#8217;t been fully announced or realized so that people can deal with it.</p><p>Another place that has been under-exploited is the question of retiring faculty and repositories. This means taking inventory of someone&#8217;s scholarly collections and migrating them to an institutional framework in an orderly fashion.</p><p>Another is how we reinterpret institutional repositories beyond universities. For example, there is something that looks a bit like an institutional repository, but has some different things about it, that belongs in public libraries or historic societies or similar.  This dimension bears exploration.</p><p>To conclude his comments he talked about a last open issue.  When we talk about good stewardship and preservation of digital materials, there are a couple of ideas that have emerged as we tried to learn from our past stewardship of print scholarly literature.  One of these principles is that geographic replication is a good thing; we&#8217;re starting to see this in the sense that most repositories are based on some geographically redundant storage system, or we&#8217;ll see a steady migration towards this in the next few years.  A second one is organizational redundancy.  If you look at the print world, it wasn&#8217;t just that the scholarly record was in a number of independent locations but also that control was replicated among institutions that were making independent decisions about adding materials to their library collection. Clearly they coordinated to a point, but they also have institutional independence. We don&#8217;t know how to do this with institutional repositories. This is also emerging in special collections as they become digital.  
Because they didn&#8217;t start life as published materials in many replicated versions, we need other mechanisms to have curatorial responsibility distributed.  This is linked to the notion that it is usually not helpful to talk about preservation in terms like &#8220;eternity&#8221; or &#8220;perpetuity&#8221; or life-of-the-republic.  It is probably better in most cases to think about preservation in one chunk at a time; an institution making a 20-year or 50-year commitment with a well-structured process at the end.  That process includes whether an institution should renew the commitment and, if not, how other interested parties could come in and take responsibility with a well-ordered hand-off.  This ties into policies and strategies for curatorial replication across institutions and ways that institutional repositories will need to work together.  It may be less critical today, but will become increasingly critical.</p><p>In conclusion, Cliff said that he hoped he left the attendees with a sense that repositories are not things that stand on their own; they are in fact mechanisms that advance policy in a very complex ecology of systems.  In fact, we don&#8217;t have our policy act together on many systems adjacent to the repository, which leads to issues of appropriate scope and interfaces with those systems.  Where repositories will evolve to in the future as we understand the role of big data is also of interest.</p><p><h2>DSpace 1.8</h2><br />Robin Taylor, the DSpace version 1.8 release manager, gave an overview of what was planned (not promised!) for the next major release.  The release schedule was to have a beta last week, but that didn&#8217;t happen.  
The remainder of the schedule is to have a beta on July 8th, feature freeze on August 19th, release candidate 1 published on September 2nd in time for the test-a-thon from the 5th to the 16th, followed by a second release candidate on September 30th, final testing October 3rd through the 12th, and a final release on October 14th.  He then went into some of the planned highlights of this release.</p><p>SWORD is a lightweight protocol for depositing items between repositories; it is a profile of the Atom Publishing Protocol.  As of the current release, DSpace has been able to accept items; the planned work for 1.8 will make it possible to send items.  Some possible use cases: publishing from a closed repository to an open repository, sending from the repository to a publisher, from the repository to a subject-specific service (such as arXiv), or vice versa.  The functionality was copied from the Swordapp demo.  It supports SWORD v1 and only the DSpace XMLUI.  A question was asked about whether the SWORD copy process is restricted to just the repository manager. The answer was that it should be configurable.  On the one hand it can be open because it is up to the receiving end to determine whether or not to accept it.  On the other hand, a repository administrator might want to prevent items from being exported out of a collection.</p><p>MIT has rewritten the Creative Commons licensing selection steps. It uses the Creative Commons web services (as XML) rather than HTML iframes, which allows better integration with DSpace.  As an aside, the Creative Commons and license steps have been split into two discrete steps allowing different headings in the progress bar.</p><p>The DSpace Community Advisory Team prioritized issues to be addressed by the developers, and for this release they include JIRA issue <a href="https://jira.duraspace.org/browse/DS-638">DS-638</a> for virus checking during submission.  
The solution invokes the existing Curation Task and requires the ClamAV antivirus software to be installed.  It is switched off by default and is configured in submission-curation.cfg.  Two other issues that were addressed are <a href="https://jira.duraspace.org/browse/DS-587">DS-587</a> (Add the capability to indicate a withdrawn reason to an Item) and <a href="https://jira.duraspace.org/browse/DS-164">DS-164</a> (Deposit interface), which was completed as the Google Summer of Code Submission Enhancement project.</p><p>Thanks to Bojan Suzic and his Google Summer of Code project, DSpace has had a REST API.  The code has been publicly available and repositories have been making use of it, so the committers group wants to get it into a finished state and include it in 1.8.  There is also work on an alternative approach to a REST API.</p><p>DSpace and DuraCloud were also covered; it was much the same as what <a href="http://dltj.org/article/or11-report-1/">I reported on earlier this week</a>, so I&#8217;m not repeating it here.</p><p>From the geek perspective, the new release will see increasing modularization of the codebase and more use of Spring and the DSpace Services Framework.  The monolithic dspace.cfg will be split up into separate pieces; some pieces would move into Spring config while other pieces could go into the database.  It will have a simplified installation process, and several components that were talked about elsewhere at the meeting: WebMVC UI, configurable workflow, and more curation tasks.</p><p><h2>Introduction to DSpace Curation Services</h2><br />Bill Hays talked about curation tasks in DSpace.  Curation tasks are Java objects managed by the Curation System.  Functionally, they are an operation run on a DSpace Object and (optionally) its contained objects (e.g., community, subcommunity, collection, and items).  They do not work site-wide, nor on bundles or bitstreams. 
The tasks can be run in multiple ways by different types of administrative users, and they are configured separately from dspace.cfg.</p><p>Some built-in tasks are to validate metadata against input forms (halts on task failure), count bitstreams by format type, virus scan (uses an external virus detection service; running on ingest is the desired use case), and the replication suite of tasks for DuraCloud.  Other tasks: link checker and 11 others (from Stuart Lewis and Kim Shepherd), format id with DROID (in development), validate/add/replace metadata, status report on workflow items, filter media in workflow (proposed), and checksum validation (proposed).</p><p>What does this mean for different users?  As a repository or collection manager, it means new functionality &#8212; GUI access without GUI development: curation, preservation, validation, reporting. As a developer: rapid development, and deployment of functionality without rebuilding or redeploying the DSpace instance.</p><p>The recommended Java development environment for tasks is with a package outside of <code>dspace-api</code>.  Make a POM with a dependency on <code>dspace-api</code>, especially <code>/curate</code>.  Required features of the task are a constructor with no arguments to support loading as a plugin and that it implements the CurationTask interface or extends the AbstractCurationTask class.  Deploy it as a JAR and configure it (similar to a DSpace plugin).</p><p>There are some Java annotations for Curation Task code that are important to know about.  Setting <code>@Distributive</code> means that the task is responsible for handling any contained DSpace objects as appropriate.  Otherwise the default is to have the task executed across all contained objects (subcommunities, collections, or items). Setting <code>@Suspendable</code> means the task interrupts processing when the first FAIL status is returned.  
Setting <code>@Mutative</code> means the task makes changes to target objects.</p><p>Invoking tasks can be done several ways: from the web application (XMLUI), the command line, from workflow, from other code, or from a queue (deferred operation).  In the case of the workflow, one can target the action of the task at any point in the workflow steps (e.g. before step 1, step 2, step 3 or at item installation).  Actions (reject or approve) are based on task results, and notifications are sent by e-mail.</p><p>A mechanism for discovering and sharing tasks doesn&#8217;t exist yet.  What is needed is a community repository of tasks.  For each task what is needed is: a descriptive listing, documentation, reviews/ratings, link to source code management system, and link to binaries applicable to specific versions.</p><p>With dynamic loading with scripting languages in <a href="http://java.sun.com/developer/technicalArticles/J2SE/Desktop/scripting/" title="Scripting for the Java Platform">JSR-223</a>, it is theoretically possible to create Curation Tasks in Groovy, JRuby, or Jython, although the only one Bill has been able to get to work so far has been Groovy.  Scripting code needs a high level of interoperability with Java, and must implement the CurationTask interface.  Configuration is a little bit different: one needs a taskcatalog with descriptors for language, name of script, and how the constructor is called.  
Bill demonstrated some sample scripts.</p><p>In his conclusion, Bill said that the new Curation Services: increases functionality for content in a managed framework; has multiple ways of running tasks for different types of users and scenarios; makes it possible to add new code without a rebuild; simplifies extending DSpace functionality; and with scripting lowers the bar even more.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/or11-report-4/feed/</wfw:commentRss> <slash:comments>11</slash:comments> </item> <item><title>Open Repositories 2011 Report: Day 2 with DSpace plus Fedora and Lots of Lightning Talks</title><link>http://dltj.org/article/or11-report-3/</link> <comments>http://dltj.org/article/or11-report-3/#comments</comments> <pubDate>Fri, 10 Jun 2011 03:58:20 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Meeting]]></category> <category><![CDATA[DSpace]]></category> <category><![CDATA[fascinator]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[microservice]]></category> <category><![CDATA[OCLC]]></category> <category><![CDATA[Open Repositories 2011]]></category> <category><![CDATA[premis]]></category> <category><![CDATA[simile]]></category><guid isPermaLink="false">http://dltj.org/?p=3004</guid> <description><![CDATA[Today was the second day of the Open Repositories conference, and the big highlight of the day for me was the panel discussion on using Fedora as a storage and service layer for DSpace. 
This seems like such a natural &#8230; <a href="http://dltj.org/article/or11-report-3/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=3004"></abbr><p>Today was the second day of the <a href="https://conferences.tdl.org/or/index.php/OR2011/OR2011main">Open Repositories conference</a>, and the big highlight of the day for me was the panel discussion on using Fedora as a storage and service layer for DSpace.  This seems like such a natural fit, but with two pieces of complex software the devil is in the details.  Below that summary are some brief paragraphs about some of the 24&#215;7 lightning talks.<br /><span id="more-3004"></span><br /><h2>Fedora inside DSpace</h2><br /><a href="https://profiles.google.com/mdiggory/about">Mark Diggory of @MIRE</a> moderated a panel of <a href="http://loomware.typepad.com/about.html" title="Mark Leggott">Mark Leggott</a> (Islandora, DiscoveryGarden and UPEI), <a href="http://bradmc.users.sourceforge.net/" title="SourceForge.net: Bradley McLean - Developer Web Hosting - Open Source Software">Bradley McLean</a> (CTO for DuraSpace), <a href="http://www.linkedin.com/pub/richard-rodgers/4/147/b2a" title="Richard Rodgers  | LinkedIn">Richard Rodgers</a> (Head of Software at MIT Libraries), <a href="http://ryan.scherle.org/" title="Ryan's Home">Ryan Scherle</a> (Technical Lead, Dryad Digital Repository), and <a href="http://www.blogger.com/profile/13668812856976810177" title="Matt Zumwalt | Blogger">Matt Zumwalt</a> (MediaShelf; Technical Lead, Hydra) on the topic of &#8220;DSpace with Fedora Inside.&#8221;  At last year&#8217;s Open Repositories conference there was a call for the DSpace and Fedora communities to explore this idea.  All of the content and metadata would be stored in Fedora with DSpace continuing to provide the user interface for workflow, discovery and administration.  
Or, viewed another way, retain the out-of-the-box experience of DSpace while exposing the versioning, object relationship, and flexible architecture features provided by Fedora.  Work on this has been going on for a number of years, starting in 2007 with Scott Yeadon demonstrating object portability between Fez/Fedora and DSpace.  In 2008, 2009 and 2010 there were three Google Summer of Code projects by Andrius Blažinskas that laid the groundwork for some of this integration by abstracting the DSpace storage layer options.</p><p>Here are some of the questions and responses from the panel.  I hope I&#8217;m representing everyone&#8217;s views as intended; comments and clarifications are welcome.</p><p>Will adding DSpace on Fedora make DSpace even more complex? <i>Matt:</i> It is an opportunity to revisit the design assumptions &#8220;and clean up your work.&#8221; This is an example of where the transition will create the opportunity to tidy up the complexity and make DSpace simpler while gaining the flexibility of Fedora. <i>Mark:</i> The complexity of DSpace isn&#8217;t in its content model, it is in trying to use the existing content model to do things DSpace wasn&#8217;t designed to do. For instance, Islandora has more complex atomistic content models, particularly with science data, than DSpace&#8217;s content model. <i>Bradley:</i> Acknowledged that there is a risk here, but &#8220;if it becomes more complex we are doing it wrong.&#8221;</p><p>There is concern from the DSpace community because DSpace may have to change to accommodate Fedora, but Fedora will not need to be changed. <i>Bradley:</i> New ideas are sources of concern.  It is difficult to categorize DSpace developers as a whole.  Any time you move a major application on top of another component you may find the underlying APIs need to change.</p><p>At what level do we align DSpace and Fedora? Do we really need an intermediary format (AIP)? 
<i>Bradley:</i> If all we do is find a way to graft the existing DSpace workflows onto a Fedora that is very specific to DSpace and can&#8217;t be used with other Fedora tools, then we haven&#8217;t moved very far.  The end goal is to find those places where DSpace is not formally specified enough and get it formally specified. <i>Mark:</i> Islandora hooks Drupal into Fedora. 80%-90% of the time we work with the Drupal-Fedora API &#8212; a simple PHP wrapper around the Fedora API.  It transforms calls into appropriate actions in Drupal. Another option comes from an Italian project that created a synchronization of Fedora objects and Drupal Nodes; it copies information from Fedora into the Drupal RDBMS.  Other applications like Omeka have a Fedora plugin.  DiscoveryGarden has also looked at things like WordPress with Fedora underneath.  As repository services become more intelligent about microservices it would take even less time to make these integrations.</p><p>What benefit does Fedora receive? <i>Richard:</i> For the Fedora community, DSpace alignment would provide a rich IR content model for Fedora. <i>Ryan:</i> DSpace was designed as an IR and nothing else; it has that at its core.  The problems that people have with DSpace are when people try to make it do something outside that vision.  Have a Fedora repository with a DSpace interface for those IR use cases, and something like Hydra or Islandora for people with other use cases. <i>Matt:</i> There has never been a large cohesion of IR workflow in Fedora; this workflow would satisfy the IR use case. <i>Bradley:</i> DSpace with Fedora inside is a repackaging with a slightly different set of existing components.  DSpace across its lifetime has tried to become more modular; integration with Fedora will make this clearer. <i>Mark:</i> Would agree that one of the main things DSpace brings to Fedora is the workflow tool and also the back-end data transformation workflows.  
But he has also never been a fan of the DSpace workflow because the staunch requirement to fill out a lot of metadata is a mistake.  Working with science data, researchers want to ingest 100K microscope images without metadata then go back and add metadata over time. <i>Ryan:</i> (Agreeing) Some of the requirements of the native installation of DSpace are difficult to work with in other use cases.  Started with configuring the workflow as much as could be done with the config file, but then created a new workflow process that still used many of the underlying tools.</p><p>The way I think of the motivation is that DSpace on Fedora will have the same easy setup with access to the underlying APIs for customization. <i>Bradley:</i> Yes &#8212; that is an aspirational goal. The practical realities mean that we will have to take steps there one at a time.  And given the time scale the question becomes whether we will get there before we decide to do something else.  One of those steps &#8212; sort of unsaid &#8212; is to take a look at EPeople and see how that would migrate. <i>Ryan:</i> Unlocking the data is one thing, but unlocking the underlying datastreams as an API is another.  DSpace storage API is opaque.  You can rebuild everything from the underlying storage. [Ryan also calls out an old DLTJ post: A key advantage of DSpace with Fedora inside. "<a href="http://dltj.org/p38">Why Fedora? Because You Don’t Need Fedora</a>".]</p><p><h2>Lightning Talks</h2><br />The other sessions that I went to today were of the quick 24&#215;7 type.  Here are some highlights.</p><p>Mark Phillips talked about a <strong>PREMIS Event Service</strong>.  He needed a way to log events that occur during the life of digital objects (virus checks, ingest, fixity check, replication) &#8212; when they occur and their outcomes. So he built a microservice based on AtomPub.  Each repository component sends outcomes of an event to this service, a central event collector.  It uses the PREMIS Event and Agent Modules.  
Event metadata includes: event_identifier, event_type, event_timestamp, event_outcome, outcome_detail, agent_identifier, object_identifier, and event_detail.  Agent metadata includes details about the software, human or organization triggering the event.  An AtomPub feed for each object returns all events for that object.  The system includes a basic search interface to see all events of a particular type and enables a feed to be set up on those searches.  There is the ability to harvest all events via OAI-PMH and Atom.  It is built with Django and Python and will be released on the MetaArchive Google Code page.</p><p>Peter Sefton talked about <strong>The Fascinator &amp; Fedora Commons: A Toolkit Tour</strong>. <a href="http://sites.google.com/site/fascinatorhome/" title="The Fascinator">Fascinator</a> is a Java-based platform targeting repository solutions. It is open source (GPL), a plugin-based platform, and highly customizable.  It was first used as an aggregator of data from various sources into a discovery service.  They tried doing the same thing on the desktop computer (something a researcher could put on a personal computer, index data, group/describe it).  He thinks of the process as a conveyor belt: harvest digital objects, transform them, store them, index them and find them.  Harvest: draw into your digital ecosystem files, databases, online resources.  Transform: be ready to present and share.  Real web stuff &#8212; not PDFs.  Video and image previews. Multiple renditions for a multi-platform world.  Storage: Store originals and their friends; basic filesystem storage or use Fedora Commons.  Index: Apache SOLR index.  Find: Faceted search interface.  Web previews (turn Word into HTML and PDF for preview, same for video transcoding). Easily customizable UI (Jython and Velocity).  
It is used by REDDBOX, Mint (described at a session yesterday), and a library of university policies (policies sent from Microsoft Sharepoint and transformed into HTML and PDF).</p><p>Rich Rogers talked about <strong>Publishing Large, Data-Rich Collections on the Web with Exhibit3</strong>. We collect and we like to share what we collect. Nowadays we live on the web, so how can we share our collections there? <a href="http://www.simile-widgets.org/exhibit/" title="Exhibit | SIMILE Widgets">Exhibit</a> is all I need to publish my collection to the web: no backend database, no server application; it will even convert a spreadsheet into usable data on the web.  Originally created by the SIMILE Project at MIT, Exhibit is an entire data publication platform with &#8216;list&#8217; and &#8216;tabular&#8217; views.  There is also a rich library of additional views. If temporal data, scrollable interactive timeline.  If geospatial data, interactive mapping displays.  If numerical data, scatter or time plots.  It is installable by HTML configuration and uses a browser-resident RDF database in JSON.</p><p>Geri Ingram and Carol Godby talked about <strong>Sustaining Collaboration Among Open Access Repositories</strong> focused on the <a href="http://www.oclc.org/gateway/" title="WorldCat Digital Collection Gateway  | OCLC">WorldCat Digital Collection Gateway</a>.  WorldCat broadens exposure to digital object collections; end users click through to the hosting repository server from WorldCat.org.  Metadata from OAI-PMH-compliant repositories is regularly harvested to WorldCat through the Gateway; digital objects remain on the local repository server.  
The gateway includes a translation service that allows repository managers to create mappings from Dublin Core (or selected other metadata schemas) to MARC.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/or11-report-3/feed/</wfw:commentRss> <slash:comments>7</slash:comments> </item> <item><title>Open Repositories 2011 Report: DSpace on Spring and DuraSpace</title><link>http://dltj.org/article/or11-report-1/</link> <comments>http://dltj.org/article/or11-report-1/#comments</comments> <pubDate>Wed, 08 Jun 2011 13:50:36 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Meeting]]></category> <category><![CDATA[DSpace]]></category> <category><![CDATA[Duracloud]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[Open Repositories 2011]]></category> <category><![CDATA[spring framework]]></category><guid isPermaLink="false">http://dltj.org/?p=2943</guid> <description><![CDATA[This week I am attending the Open Repositories conference in Austin, Texas, and yesterday was the second preconference day (and the first day I was in Austin). Coming in as I did I only had time to attend two preconference &#8230; <a href="http://dltj.org/article/or11-report-1/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=2943"></abbr><p>This week I am attending the <a href="https://conferences.tdl.org/or/index.php/OR2011/OR2011main">Open Repositories conference</a> in Austin, Texas, and yesterday was the second preconference day (and the first day I was in Austin).  
Coming in as I did, I only had time to attend two preconference sessions: one on the integration &#8212; or maybe &#8220;invasion&#8221; &#8212; of the <a href="http://www.springsource.org/" title="SpringSource.org |">Spring Framework</a> into <a href="http://www.dspace.org/" title="http://www.dspace.org/">DSpace</a> and one on the introduction of the <a href="http://www.duraspace.org/duracloud.php" title="NOW AVAILABLE: DuraCloud Open Source 0.7 | Duraspace">DuraCloud</a> service and code.<br /><span id="more-2943"></span><br /><h2>DSpace and the Spring Framework</h2><br />The Spring-Framework-in-DSpace session was presented by <a href="https://profiles.google.com/mdiggory/about">Mark Diggory of @MIRE</a>.  He spoke from a DuraSpace wiki page set up as a <a href="https://wiki.duraspace.org/display/DSPACE/DSpace+Spring+Services+Tutorial">tutorial</a> on the topic.  In the first part of his presentation he introduced the inversion-of-control pattern, explaining why it is useful and showing how it works with simple code examples.  He then showed how a Spring-based ServiceManager can be integrated into the DSpace main code and then how new services can be plugged into that manager.</p><p>I came into the session more familiar with the Spring Framework than with the DSpace code, so I found the session to be a good introduction to some of the DSpace concepts even though I wasn&#8217;t the target audience.  (I imagine the target audience was someone familiar with the DSpace code wanting to learn about the Spring Framework.)  Thanks, Mark, for putting up the web tutorial and walking through it during the preconference session.</p><p><h2>DuraCloud Introduction</h2><br />The second preconference I went to was on the introduction of DuraCloud services from DuraSpace.  I can honestly say that I didn&#8217;t get what DuraCloud was supposed to be before, but seeing the about-to-be-released web interface I can say I think I finally get it.  
DuraCloud is going to be both open source software and a service from DuraSpace that can back up a repository with storage, media access services, and compute/transformation services.</p><p>The session showed the web-based administration interface and the supporting tools for integrating a DSpace repository and a Fedora repository into DuraCloud.  Attendees were also given access to a command-line Java application that could be used to upload content into a DuraCloud instance, although sadly it wasn&#8217;t demonstrated during the preconference session.  (Perhaps I&#8217;ll try it out on the sly later with the DuraCloud credentials they gave us at the session&#8230;)  In addition to the functionality being built into DSpace and Fedora there will be REST-based code libraries for Java, PHP and Python &#8212; meaning that any developer could write code to make use of DuraCloud with any repository platform.  The whole DuraCloud application is going to be released into open source under the Apache 2.0 license as part of the efforts to create a community using the code and encourage others to write new services for DuraCloud.  
This is something I&#8217;m going to keep watching; I&#8217;ve already signed up for a preview account for when the beta is released later this month.</p><p>Thanks also to DuraSpace for sponsoring the evening reception after the preconference session at the University of Texas Club.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/or11-report-1/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>OhioLINK is Seeking Two Senior Repository Software Developers</title><link>http://dltj.org/article/developer-search/</link> <comments>http://dltj.org/article/developer-search/#comments</comments> <pubDate>Thu, 15 Oct 2009 16:03:07 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[OhioLINK]]></category> <category><![CDATA[DSpace]]></category> <category><![CDATA[jobs]]></category> <category><![CDATA[scrum]]></category> <category><![CDATA[University System of Ohio]]></category><guid isPermaLink="false">http://dltj.org/?p=1316</guid> <description><![CDATA[My place of work, OhioLINK, is part of a larger group called the Educational Technology Division of the University System of Ohio. In that capacity, we&#8217;re seeking two senior repository software developers to work in our downtown Columbus, OH, office. The &#8230; <a href="http://dltj.org/article/developer-search/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/?p=1316"></abbr><p>My place of work, <a href="http://www.ohiolink.edu/" title="OhioLINK homepage" rel="homepage">OhioLINK</a>, is part of a larger group called the Educational Technology Division of the <a href="http://www.uso.edu/" title="University System of Ohio homepage" rel="homepage">University System of Ohio</a>.  
In that capacity, we&#8217;re seeking two senior repository software developers to work in our downtown Columbus, OH, office.</p><p>The position description can be a little tricky to get to &#8212; the Ohio State University jobs website does not allow deep linking into job descriptions &#8212; so I&#8217;m reproducing the entire description here:</p><blockquote class="jobinfo"><table><tr><th class="tableColumnHeader" align="left" height="30">Position Information</th></tr><p></p><tr><td><input name="elementToConfigure" value="" type="hidden" /><table width="100%" border="0" cellpadding="3" cellspacing="1"><tbody><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_92">Number of Positions Available</label></span></td><td class="tableInLtShade" width="33%" height="30">1 &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_18">&nbsp;</label></span></td><td class="tableInLtShade" width="33%" height="30"><strong>Both current Ohio State employees and the general public may apply for this unclassified professional position.</strong> &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_19">&nbsp;</label></span></td><td class="tableInLtShade" width="33%" height="30">&nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_1">University Title</label></span></td><td class="tableInLtShade" width="33%" height="30">Senior Systems Manager-Not Sap &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_54">Working Title</label></span></td><td class="tableInLtShade" width="33%" height="30">Sr. 
Repository Developer &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_21">Department</label></span></td><td class="tableInLtShade" width="33%" height="30">Office of Research-OARnet &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_59">Department Location</label></span></td><td class="tableInLtShade" width="33%" height="30">Columbus &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_20">Requisition Number</label></span></td><td class="tableInLtShade" width="33%" height="30">347544 <i>[Also requisition number 347545]</i>&nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_2">Summary of Duties</label></span></td><td class="tableInLtShade" width="33%" height="30">Supports software development operations for Ohio Academic Resources Network (OARnet), in collaboration with the Chancellor for the Ohio Board of Regents (OBR) and the University System of Ohio (Education Technology Division), in accordance with university policies, goals, and objectives; participates in regular operation of SCRUM-based software development team; identifies project development requirements in conjunction with stakeholders, including Product Owner, community representatives and SCRUM team lead; develops technical solutions to meet the business objectives of product requirements in accordance with established OARnet/OBR software development standards; divides technical solutions into component-level features; assigns, monitors, and reviews component-level feature development, testing and integration tasks performed by team members; serves as technical SME for application development environment; performs investigation and tracking of 
industry trends and exploration of advanced technologies; serves as an expert consultant within and outside OARnet, and significant participation in advising and planning committees and task forces; designs, plans, and coordinates development/construction of systems; serves as a mentor to other development associates. &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_71">Additional Information</label></span></td><td class="tableInLtShade" width="33%" height="30">Successful completion of a background check required. &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_5">Required Qualifications</label></span></td><td class="tableInLtShade" width="33%" height="30">Bachelor&#8217;s Degree in computer &amp; information science or an equivalent combination of education and experience; extensive (5 years) Java development experience involving DSpace/Manakin, Cocoon, XML/XSLT and HTML/CSS site creation; considerable experience (3 years) with JBoss Application Server; considerable experience (3 years) with Linux/Unix, Perl, shell scripting; experience (1 year) with Log4j, JUnit, Maven and Apache Commons. &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_400">Desired Qualifications</label></span></td><td class="tableInLtShade" width="33%" height="30">Master&#8217;s of Library Science degree; experience with digital archive projects including metadata schema creation. 
&nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_50">Target Salary</label></span></td><td class="tableInLtShade" width="33%" height="30">$76,000 &#8211; $84,000 Annually &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_22">Job Category</label></span></td><td class="tableInLtShade" width="33%" height="30">Information Technology (IT) &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_57">Job Appointment (FTE%)</label></span></td><td class="tableInLtShade" width="33%" height="30">100% &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_23">Full/Part Time</label></span></td><td class="tableInLtShade" width="33%" height="30">Full Time &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_72">Temporary or Regular</label></span></td><td class="tableInLtShade" width="33%" height="30">Regular &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_16">Posting Start Date</label></span></td><td class="tableInLtShade" width="33%" height="30">10-09-2009 &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_17">Posting End Date</label></span></td><td class="tableInLtShade" width="33%" height="30">10-25-2009 &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_77">Does this position accept online applications?</label></span></td><td class="tableInLtShade" width="33%" 
height="30">Yes &nbsp;</td></tr><tr align="left" valign="middle"><td class="tableInDeepShade" width="17%" height="30"><span class="subBodytext"><label for="di_79">Faculty Application Instructions</label></span></td><td class="tableInLtShade" width="33%" height="30">&nbsp;</td></tr></tbody></table></td></tr></table></blockquote><p>OARnet is a constituent of the University System of Ohio Educational Technology Division, and OARnet&#8217;s administrative agent is Ohio State University.  To apply for the position, go to <a href="https://www.jobsatosu.com/" title="Job Postings at Ohio State University">Ohio State University&#8217;s Job Site</a>, select &#8220;Search Postings&#8221; and use either requisition number 347544 or 347545.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/developer-search/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>JPEG2000 to Zoomify Shim &#8212; Creating JPEG tiles from JPEG2000 images</title><link>http://dltj.org/article/introducing-j2ktilerenderer/</link> <comments>http://dltj.org/article/introducing-j2ktilerenderer/#comments</comments> <pubDate>Thu, 28 Feb 2008 12:15:42 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[JPEG2000]]></category> <category><![CDATA[code4lib]]></category> <category><![CDATA[code4lib Conference 2008]]></category> <category><![CDATA[DSpace]]></category> <category><![CDATA[j2ktilerenderer]]></category> <category><![CDATA[java]]></category> <category><![CDATA[jpeg2000]]></category> <category><![CDATA[restlet]]></category><guid isPermaLink="false">http://dltj.org/article/introducing-j2ktilerenderer/</guid> <description><![CDATA[This is a textual representation of a lightning talk done on Feb 26th at Code4Lib 2008. When the video of the talk is up (thanks, Noel!) I&#8217;ll link it here, too. 
The video is now available, and that article includes &#8230; <a href="http://dltj.org/article/introducing-j2ktilerenderer/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/article/introducing-j2ktilerenderer/"></abbr><p>This is a textual representation of a lightning talk done on Feb 26th at <a href="http://code4lib.org/conference/2008" title="Code4Lib 2008 Conference Homepage">Code4Lib 2008</a>. <del datetime="2008-05-15T19:17:08+00:00">When the video of the talk is up (thanks, Noel!) I&#8217;ll link it here, too.</del> The video is <a href="http://dltj.org/article/jpeg2000-to-zoomify-lightning-talk-video/">now available</a>, and that article includes an update on progress since this article was posted.</p><p>OhioLINK has a collection of JPEG2000 images as an access format that were generated for use in our <a href="http://dlxs.org/" title="Digital Library eXtension Service homepage">DLXS</a>-based content system.  We are in the process of migrating those collections to DSpace and were looking for a mechanism to leverage the existing JPEG2000 files and not have to generate new derivatives.  We are also considering the use of JPEG2000 as a preservation format, and would find it attractive to use the same image format for both access copies and preservation copies.  We looked at Zoomify, but to perform its scaling function it generates JPEG tiles at several resolutions and storing those tiles can triple or quadruple disk space requirements.  Or, one could use the &#8216;enterprise&#8217; version of Zoomify and its proprietary PFF format or the equally proprietary MrSID format.  We didn&#8217;t want to be locked into either of these scenarios.  
Our solution is to create a web application that mimics the directory-of-JPEG-tiles solution, but to dynamically generate the tiles out of a JPEG2000 master.</p><p>The free version of Zoomify reads JPEG tiles out of a directory structure that looks like this:</p><table cellpadding="3"><tr><td style="white-space: nowrap;" valign="top">/ImageProperties.xml</td><td>Includes descriptive elements of the source image like height, width, and tile size.</td></tr><tr><td style="white-space: nowrap" valign="top">/TileGroup0/0-0-0.jpg</td><td>The highest power-of-2 zoom out level that creates an image with dimensions less than 256&#215;256</td></tr><tr><td style="white-space: nowrap" valign="top">/TileGroup0/1-0-0.jpg</td><td>The tile at the upper left corner at the first power-of-2 zoom level</td></tr><tr><td style="white-space: nowrap" valign="top">/TileGroup0/1-1-0.jpg</td><td>The tile to the right of 1-0-0.jpg</td></tr></table><p>The shim mimics that directory structure.  It parses the URL of the request and dynamically creates the appropriate JPEG tile (or metadata file) out of the JPEG2000 image.</p><p><h2>The Code</h2><br />The JPEG2000 for Zoomify shim requires <a href="http://java.sun.com/javase/downloads/" title="Java Download page">Java</a> 1.5 or greater.  It does not require a servlet engine; rather, it uses the <a href="http://www.restlet.org/" title="Restlet project homepage">Restlet</a> library to perform as a stand-alone application.  The <a href="http://one-jar.sourceforge.net/" title="OneJar project homepage">OneJar</a> library allows the Java classes and required dependencies to be bundled into a single JAR file.  We&#8217;re using the <a href="http://www.kakadusoftware.com/" title="Kakadu Software homepage">Kakadu Software JPEG2000 library</a> to perform the on-the-fly decoding of JPEG2000 images.  
Kakadu is a commercial JPEG2000 codec, although <a href="http://www.kakadusoftware.com/index.php?option=com_virtuemart&amp;Itemid=19&amp;vmcchk=1&amp;Itemid=19" title="Kakadu Software purchasing and licensing guidelines">inexpensive licenses are available</a> for not-for-profit activity.  We are using the Enterprise version of <a href="http://www.zoomify.com/" title="Zoomify homepage">Zoomify</a>, a Flash-based image viewer, although I believe the free version will work as well.  (You&#8217;ll need the Enterprise version to be able to modify and adapt the appearance of the Zoomify applet.)  The same techniques can also be used for other Flash applets and probably even JavaScript-based viewers (<i>a la</i> Google Maps).</p><p>The source code is available from the <span class="removed_link" title="https://drc-dev.ohiolink.edu/browser/j2kTileRenderer/trunk">OhioLINK DRC source code repository</span> (<a href="https://drc-dev.ohiolink.edu/svn/j2kTileRenderer/trunk">Subversion access</a>).  We plan to integrate it into DSpace 1.5 as part of the <a href="http://info.drc.ohiolink.edu/" title="Ohio Digital Resource Commons | Save, Discover, and Share Your Resources and the Resources of the World">Ohio Digital Resource Commons</a>, and I may create a Fedora disseminator to serve up the tiles as well.</p><p>Thanks go out to Keith Gilbertson and John Davison on the OhioLINK staff for their help in making this work as well as Stu Hicks and François d&#8217;Erneville for being a sounding board for these ideas.<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to https://drc-dev.ohiolink.edu/browser/j2kTileRenderer/trunk on January 13th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://code4lib/conference/2008 to http://code4lib.org/conference/2008 on January 28th, 2011.</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link 
from http://www.kakadusoftware.com/Purchasing.html to http://www.kakadusoftware.com/index.php?option=com_virtuemart&#038;Itemid=19&#038;vmcchk=1&#038;Itemid=19 on January 28th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/introducing-j2ktilerenderer/feed/</wfw:commentRss> <slash:comments>13</slash:comments> </item> <item><title>Presentation Summary: &#8220;Cross-Repository Semantic Interoperability: the MIT SIMILE Project&#8221;</title><link>http://dltj.org/article/simile/</link> <comments>http://dltj.org/article/simile/#comments</comments> <pubDate>Mon, 29 Jan 2007 19:32:33 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Meeting]]></category> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[digital libraries]]></category> <category><![CDATA[DSpace]]></category> <category><![CDATA[icor2007]]></category> <category><![CDATA[metadata]]></category> <category><![CDATA[RDF]]></category> <category><![CDATA[semantic web]]></category><guid isPermaLink="false">http://dltj.org/2007/01/simile/</guid> <description><![CDATA[Richard Rodgers presented this talk based on his work with MacKenzie Smith in the Digital Library Research Group at MIT. The original abstract of the presentation was: Many questions are raised as previously unreachable digital content is found in &#8230; <a href="http://dltj.org/article/simile/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/01/simile/"></abbr><p>Richard Rodgers presented this talk based on his work with MacKenzie Smith in the Digital Library Research Group at MIT.  The original abstract of the presentation was:</p><blockquote><p>Many questions are raised as previously unreachable digital content is found in and among new repositories&#8211;is each repository an island or a separately searchable resource? 
SIMILE (Semantic Interoperability of Metadata and Information in Unlike Environments) has developed an extensive &#8216;tool chain&#8217; for gathering and manipulating data assets. Richard Rodgers and MacKenzie Smith, MIT, will demonstrate how tools developed by the SIMILE project can be used as powerful instruments for the federation, discovery, exploration, and curation of metadata.</p></blockquote><p>The mission of the <a href="http://simile.mit.edu/" title="SIMILE Project homepage">SIMILE suite of projects</a> is to build tools for data interoperability.  Dealing with heterogeneous metadata in repository design and use is a complex challenge, and the position that SIMILE takes is that no matter what single metadata scheme one selects at the start of a repository project, one runs into trouble as subsequent collections come in with other semantically-rich collection-specific metadata schemes.  This puts the repository designer between a rock (semantic reduction and loss because metadata crosswalks are &#8220;lossy&#8221;) and a hard place (serious scalability problems, such as whether one constructs separate queries for each metadata schema, if all of the uniqueness of the metadata coming to the repository is embraced).</p><p>SIMILE uses RDF and other semantic web technologies to contribute to a solution to the heterogeneous metadata problem.  Statements about documents are inherently more mixable than the documents themselves, and RDF is a more mixable language than trying to harmonize metadata.  RDF represents data as a graph, not as a table (RDBMS) or tree (XML).  
The tools created by SIMILE fall into four categories:</p><ul><li>Convert: RDFizers (for converting structured data to RDF, such as MARC into RDF), Babel</li><li>Visualize: Gadget (a data graph viewer for XML; it constructs all of the XPaths in a document and projects them along with frequency of occurrences as a way to look at XML documents from a structural level), Welkin (the same as Gadget, but for RDF)</li><li>Browse: Longwell (see below), Piggy Bank (Firefox plugin; RDFizes an HTML page by using JavaScript to scrape metadata from websites and putting it into your personal repository), Semantic Bank (a way to publish RDF and create communities of RDF content)</li><li>Lightweight UI: Timeline, Exhibit widgets (highly interactive faceted browse displays that divide the processing between client and server through the use of AJAX)</li></ul><p>Richard went into detail on Longwell, a faceted browser web application.  Using an RDF triple-store backend (Sesame), Longwell presents data in a configurable, extensible user interface.  One of the interesting technologies it uses is the W3C-defined Fresnel Display Vocabulary.  The RDF world has no layout styling language equivalent to CSS.  The W3C thought they could spur development of RDF tools if there was a way of expressing a display vocabulary in RDF, hence the Fresnel Display Vocabulary.  
Longwell has been embedded into DSpace as an <a href="http://dspace.mit.edu/dwell/advanced-search" title="DSpace at MIT: Advanced Search">optional advanced search engine called &#8220;DWell&#8221;</a>.</p><p>Update at 20070129T1646 &mdash; also see Dorothea Salo summary:</p><p>Source: Caveat Lector » SIMILE<br />Address : <a href="http://cavlec.yarinareth.net/archives/2007/01/28/simile/" title="403 Forbidden">http://cavlec.yarinareth.net/archives/2007/01/28/simile/</a><br />Date Visited: Mon Jan 29 2007 16:43:35 GMT-0500 (EST)</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/simile/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Open Source for Open Repositories &#8212; New Models for Software Development and Sustainability</title><link>http://dltj.org/article/open-source-for-open-repositories/</link> <comments>http://dltj.org/article/open-source-for-open-repositories/#comments</comments> <pubDate>Thu, 25 Jan 2007 04:37:10 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Meeting]]></category> <category><![CDATA[Sakai]]></category> <category><![CDATA[Unified Content Repository]]></category> <category><![CDATA[DSpace]]></category> <category><![CDATA[eprints]]></category> <category><![CDATA[Fedora]]></category> <category><![CDATA[higher education]]></category> <category><![CDATA[icor2007]]></category> <category><![CDATA[open source]]></category><guid isPermaLink="false">http://dltj.org/2007/01/open-source-for-open-repositories/</guid> <description><![CDATA[This is a summary of a presentation by James L. Hilton, Vice President and CIO of University of Virginia, at the opening keynote session of Open Repositories 2007. 
I tried to capture the essence of his presentation, and omissions, contradictions, &#8230; <a href="http://dltj.org/article/open-source-for-open-repositories/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/01/open-source-for-open-repositories/"></abbr><p>This is a summary of a presentation by <a href="http://www.virginia.edu/vpcio/biography.html" title="http://www.virginia.edu/vpcio/bio.html">James L. Hilton</a>, Vice President and CIO of the University of Virginia, at the opening keynote session of <a href="http://openrepositories.org/" title="Open Repositories 2007">Open Repositories 2007</a>.  I tried to capture the essence of his presentation, and omissions, contradictions, and inaccuracies in this summary are likely mine and not those of the presenter.</p><p><h2>Setting the stage</h2></p><p>This is a moment in which institutions may be willing to invest in open source development in a systematic way (as opposed to what could currently be characterized as an <i>ad hoc</i> fashion) driven by these factors:</p><ul><li><strong>Fear</strong>. Prior to Oracle&#8217;s hostile take-over of PeopleSoft, the conventional wisdom of universities was that they needed to buy their core enterprise applications rather than build them.  In doing so, they sought the comfort of buying the security of a leading platform.  Oracle&#8217;s actions diminished that comfort level.  Blackboard&#8217;s acquisition of WebCT and its lawsuit against a competitor do not help either.</li><li><strong>Disillusionment and ERP fatigue</strong>.  What was largely thought to be an outsourced project was found to be an endless upgrade cycle.  
Organizations need to build entire support units to handle the upgrades for large ERP systems rather than supporting the needs of the users.</li><li><strong>Incredulity &#8212; we&#8217;re supposed to do what?</strong> The application of technology typically has a disruptive impact (cannot predict the end), the stakes are incredibly high (higher education and/or research could be lost in a decade), it tends to be expensive, and the most common survival strategy is to seed many expensive experiments in the hopes that one will be in the right place at the time the transition needs to happen.  The massive investment anticipated for technology to support academic computing (libraries, high-performance clusters, etc.) will pale in comparison to the investment in administrative computing.</li><li><strong>Rising tide of collaboration</strong>.  This is a realization that the only way to succeed is through collaboration.  To paraphrase Hilton, &#8220;In the new order it will be picking the right collaborative partners where the new competitive advantage will come from.&#8221;</li></ul><p><h2>Distinctions</h2></p><p>Hilton offered these definitions and contrasts as a way to frame the rest of his discussion.  First was <strong>Open or &#8220;free&#8221; software</strong>.  Free as in beer, or free as in &#8220;adopt a puppy.&#8221;  The software comes with the ability to do as you want with the code, not just the ability to use the code.  Then he defined the term <strong>License</strong> as a contract: whatever you agree to, you are bound to; you cannot use copyright law to protect you.  The rules and conditions that are applied to the software do matter.</p><p>Lastly, he talked about <strong>Copyleft or &#8220;viral&#8221;</strong> licensing.  There are different interpretations of &#8220;open&#8221; in open source.  &#8220;Copyleft&#8221; has come to mean that code should be freely available to be used and modified, and it should never be locked up.  
The GPL is an example.  This is often called &#8220;viral&#8221; because if you include software with this license in any other work that is released, the additional software must be released under the same license.  This is seen by some as valuable because it prevents open source from being encircled by proprietary code.  Copyleft is contrasted with an &#8220;open/open&#8221; license, under which you can do whatever you want with the code.  An &#8220;open/open&#8221; license places no restrictions on what users do with code in derivative software packages.</p><p><h2>Case Study &#8212; Michigan&#8217;s Sakai Sojourn</h2></p><p>Hilton briefly described why UMich went down the Sakai path in 2001-2002:</p><ul><li>Legacy system with no positive trajectory forward.  It could never be released into open source; all of the development would have to be carried on UMich&#8217;s shoulders forever.</li><li>Saw market consolidation in CMS.  This was mostly evident in the commercial sector with Blackboard and WebCT being the dominant choices.  They had concerns about the cost of licenses in this environment down the road.</li><li>Saw the potential of tapping the institution&#8217;s core competencies and starting a virtuous cycle of development, teaching and research.  Or, put another way, they didn&#8217;t want core competencies in teaching and research held hostage to a commercial development cycle.</li><li>Strategic desire to blur the distinction between the laboratory and the classroom and between knowledge creation and digestion.  They realized that the functions of a research support tool and a course support tool were pretty much the same under different skins, and they sought to blur that distinction even more.</li><li>NRC report and the need for collaboration.  
UMich was willing to fund the project internally for two years but knew that after that it needed to find collaborative partners by the fifth year for the project to be declared a success.</li><li>A moment-in-time opportunity that synchronized the development process of several partners with funding provided by the Mellon Foundation.</li></ul><p>There were also specific goals for the Sakai project.  The new system had to replicate the functionality of existing course and research collaboration environments.  They also wanted experience in finding partners willing to collaborate.  Hilton said, &#8220;Sakai was/is at least as interesting from a collaboration perspective as it is from the technology perspective.&#8221;  Bringing together disparate organizations with different beliefs on how things should be done is a challenge.  Additionally, they wanted to get better as an institution at discerning open source winners; it shouldn&#8217;t be like a lottery.  Lastly, they wanted to implement software parts that were not built at UMich.  Each partner institution committed to implementing the same thing even if it wasn&#8217;t built at that institution.  This is tough to do, but they knew they needed to do it for their own good in the long run.</p><p>What happened?  Not only did the original partners show up, but the community came, too.  Even more interesting was that the community was formed with dues-paying members, even in a world where the software is free.  It became a vibrant community, too, with a conference every six months.  Sakai was released under an open/open license model, and corporate partners showed up as well (selling support services, or hosting services, or hardware for the software).  The software did grow up and leave its home; a separate foundation now holds the intellectual property of the code (originally, partners assigned copyright to UMich).  
They also positioned Sakai to be a credible threat to the commercial entities in order to force them to the standards table.</p><p><h2>Takeaway lessons that generalize to open source development</h2></p><p>First, the benefits of open source development.</p><ul><li>destiny control (but only when you really need to drive).  Having the control is not always a good thing. Is it worth the effort?  Is the project core to the institution&#8217;s mission?  (Does it directly support scholarship and teaching?)</li><li>builds community and camaraderie (in the case of Sakai, both locally at UMich and internationally)</li><li>unbundles software ownership and its support.  Inspires more competition in the implementation and support space.</li><li>community source provides institutions an opportunity to leverage links between open source, open access and culture of the academy/wider world (a.k.a. put up or shut up)</li></ul><p>Then, the challenges of open source development.</p><ul><li>Guaranteeing clean code (IP) is hard (read as &#8220;impossible&#8221;).  A certain amount of faith about the code they get is required, and there needs to be consideration for mitigating risks.</li><li>Figuring out who is authorized to license institutionally-owned code is challenging, and then you have to convince them to give it away.  No one in the institution typically has been appointed or given the authority to release code.  One of the things that the Sakai licensing discussions highlighted was institutional differences in requirements and aesthetics.</li><li>Patent quagmire always looming.  How do you know your software is not infringing?  How do you make sure you don&#8217;t inadvertently give away all of the institution&#8217;s patents?  Be careful when looking at licenses from an institutional perspective versus an individual perspective.</li><li>There is also the inevitable lawsuit risk.  
Or, as your counsel might say to you, &#8220;Let me get this straight, we can get sued but there&#8217;s no one we can sue.&#8221;</li></ul><p>Then, some discoveries that they made along the way.</p><ul><li>An open source project is not a silver bullet.  The commitment to build rather than buy must align with institutional priorities and competencies; it is not right for every project/application.</li><li>Licensing does matter; it is a contract:  whatever you stick in its rules is what sticks.  There are probably too many open source license options and some sort of standardization is needed.  Also keep in mind that if you release something under an open/open license, you can&#8217;t include any copyleft components.</li><li>Communities don&#8217;t just happen; they require a specific shared purpose (when visions vary, or when they change, collaborations struggle) and governance (e.g., separate board with dedicated developers sitting between institutions).  Cooperation (&#8220;I won&#8217;t hurt you if you don&#8217;t hurt me&#8221;) is not collaboration.</li><li>Open (community) source requires real project discipline.  &#8220;It is as spontaneous as a shuttle launch.&#8221;  Along the way one needs to learn to balance pragmatics and ideals.  One also needs to learn to trust one&#8217;s partners.  &#8220;It really requires learning to let go.&#8221;  Letting go, and having the community make the decisions, may be the quickest path to efficiency.</li></ul><p><h2>Reflection on open/community source for repositories</h2></p><p>Repositories are at the center of everything at the institution.  They connect with the library, with the presses/scholarly publishing operation, with classroom teaching, with the laboratory, and with the world.  The repository is a core piece of infrastructure for the university of the 21st century.  
As institutions, we need to make sustaining investments in our repositories.</p><p>Hilton sees three different approaches to &#8220;community&#8221; in the existing projects:</p><ul><li>DSpace:  a community of user/developers.  They come together to talk about what they want to do, write code, and support each other.  Clearly there are enthusiastic users acting as developers.</li><li>Eprints: appears more like a vendor talking with customers, wanting the community to help shape the direction.</li><li>Fedora: in transition from a combination of the previous two models towards a Sakai-like model.  It will require institutions to make commitments to it.</li></ul><p>In the end, Hilton asked some thought-provoking questions. Is now the time for institutional investment in open/community source?  Will a coherent community (or communities) emerge in ways that are sustainable?  Is there a shared vision?</p><p style="padding:0;margin:0;font-style:italic;">The text was modified to update a link from http://www.virginia.edu/vpcio/bio.html to http://www.virginia.edu/vpcio/biography.html on January 19th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/open-source-for-open-repositories/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> <item><title>Heads up!  
International Conference on Open Repositories (01/23/07 &#8211; 01/27/07, San Antonio, TX, US)</title><link>http://dltj.org/article/icor20007/</link> <comments>http://dltj.org/article/icor20007/#comments</comments> <pubDate>Tue, 04 Jul 2006 01:28:22 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Fedora]]></category> <category><![CDATA[Meeting]]></category> <category><![CDATA[DSpace]]></category> <category><![CDATA[eprints]]></category> <category><![CDATA[icor]]></category> <category><![CDATA[icor2007]]></category> <category><![CDATA[open source]]></category><guid isPermaLink="false">http://dltj.org/2006/07/heads-up-international-conference-on-open-repositories-012307-012707-san-antonio-tx-us/</guid> <description><![CDATA[Open Repositories 2007 is coming up next year, and it looks to be an interesting meeting. The first day is open user group meetings for DSpace, Fedora, and Eprints, followed by general conference sessions that cover issues that cut across &#8230; <a href="http://dltj.org/article/icor20007/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2006/07/heads-up-international-conference-on-open-repositories-012307-012707-san-antonio-tx-us/"></abbr><p><a href="http://openrepositories.org/" title="301 Moved Permanently">Open Repositories 2007</a> is coming up next year, and it looks to be an interesting meeting.  The first day is open user group meetings for DSpace, Fedora, and Eprints, followed by general conference sessions that cover issues that cut across all of the open repository systems. 
This year, the user groups will partition their programs into Plenary, Technical Issues, and Management Issues and the partitions will be staggered so that IT managers can attend all plenary sessions, technical staff can attend all technical sessions, etc.</p><p>The <span class="removed_link" title="http://openrepositories.org/call">call for participation</span> for the general conference has gone out.  Its Program Committee is seeking submissions in the form of an extended abstract of no more than 500 words by October 2, 2006. The contributions must be written in English and should be double spaced. The Program Committee will select relevant submissions. Selected speakers will receive an email by November 6, 2006 with guidelines for their presentation. Presentations will be limited to 20 minutes, plus 10 minutes for questions.</p><p>This looks to be a really good meeting.  You can track it on HitchHikr at <a href="http://hitchhikr.com/index.php?conf_id=75" title="" class="broken_link" rel="nofollow">http://hitchhikr.com/index.php?conf_id=75</a>.<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://openrepositories.org/call on October 29th, 2010.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/icor20007/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> </channel> </rss>
