Choosing the right archiving platform

Posted on 6th May 2014 by Patrick Coyle

As a web designer and developer, my role in the Accounts of the Conflict project is to identify a suitable repository platform to manage the stories that we are collecting and archiving.

I have thoroughly tested a few systems, so I will go through the benefits of each and explain which one we decided was best suited to our particular project.

EPrints http://www.eprints.org/

The first platform that I tested was EPrints, which was developed by the University of Southampton. It is particularly widely used in the UK, but it also has quite a large user base worldwide.

EPrints is a free, Perl-based, feature-rich digital repository platform that offers a great deal of flexibility in the files that can be stored and the information (metadata) that can be recorded.

Whilst this is probably what we are looking for in terms of file and information storage for long-term preservation, I came across some difficulties in customising the system to give it the modern feel that we envisage for our final project website.

The design of the EPrints system itself is very complex, which makes it quite difficult to present files and metadata in a way that differs from the default offered by the developers. That being said, it is not impossible to customise, and there is quite a large selection of technical documentation and training materials available. The developers have also created the EPrints Bazaar, which is almost like a mini free app store with some useful add-ons that extend the system beyond its default form.

DSpace http://registry.duraspace.org/

The next platform I tested was DSpace, which is very similar to EPrints in terms of functionality but uses a slightly different approach to the management of files and metadata.

It is an open source, Java-based system. Installation is a bit more complex than with EPrints, but once installed it is easier to manage and maintain.

Like EPrints, it offers a repository environment for the long-term preservation of files and metadata, with plenty of options available to describe every archived item in detail. As it is Java-based, there are also many Java APIs that can be used to connect DSpace with other systems with minimal effort.

Both of these systems offer advanced search options that allow users to find exactly what they are looking for, with EPrints probably coming out on top for the range of options it provides.

OMEKA http://www.omeka.org

OMEKA is a PHP-based open source platform that acts more like a content management system than a repository. It was developed by the Roy Rosenzweig Center for History and New Media at George Mason University. Museums and similar institutions frequently use it for compiling digital collections and exhibitions.

It is very easy to set up and manage and is highly customisable: almost every aspect of the public-facing output can be tailored to the developer’s requirements.

There is a large collection of documentation to help tailor the system and understand how it works, as well as a very well managed support forum where developers and fellow users can help where the documentation does not.

As with most CMS platforms, a variety of extensions can be added to the default installation to extend the capabilities of the system, with one in particular allowing communication between repositories using the OAI standards.

Unlike most CMS platforms, OMEKA provides a solid foundation on which to archive digital media and records, with Dublin Core metadata included as standard and the ability to add any number of custom metadata fields for many item types should they be required.
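For readers unfamiliar with the OAI standards, the short Python sketch below shows roughly what an OAI-PMH harvesting request and response look like. It is only an illustration: the base URL is a made-up placeholder, the sample response is hand-written, and the helper names are my own rather than anything provided by OMEKA, EPrints or DSpace.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used for Dublin Core elements inside an oai_dc record.
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(xml_text):
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(f"{{{DC_NS}}}title")]


# Hand-written sample response (an assumption, not real output from any system).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample story</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    # Placeholder endpoint; a real repository publishes its own base URL.
    print(list_records_url("https://example.org/oai-pmh-repository/request"))
    print(extract_titles(SAMPLE_RESPONSE))
```

In practice a harvester would fetch the URL over HTTP and follow resumption tokens for large result sets; both are left out here to keep the sketch short.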
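To make the idea of standard plus custom metadata more concrete, here is a minimal Python sketch of an item that keeps the fifteen Dublin Core elements separate from any project-specific fields. This is not how OMEKA stores its data internally; the ArchiveItem class and the "Interviewer" field are hypothetical names chosen purely for illustration.

```python
from dataclasses import dataclass, field

# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


@dataclass
class ArchiveItem:
    """Hypothetical archived item: Dublin Core fields plus custom fields."""
    dublin_core: dict[str, list[str]] = field(default_factory=dict)
    custom: dict[str, list[str]] = field(default_factory=dict)

    def add(self, name, value):
        """Store a value under Dublin Core if the name matches, else as custom."""
        key = name.lower()
        if key in DUBLIN_CORE_ELEMENTS:
            self.dublin_core.setdefault(key, []).append(value)
        else:
            # Custom field names are kept exactly as supplied.
            self.custom.setdefault(name, []).append(value)


if __name__ == "__main__":
    item = ArchiveItem()
    item.add("Title", "A sample story")
    item.add("Interviewer", "A. N. Other")  # hypothetical custom field
    print(item.dublin_core)
    print(item.custom)
```

Each field holds a list because Dublin Core elements are repeatable, so an item can have several creators or subjects.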

We have chosen OMEKA as the base for our archive due to these customisation options. However, we plan to use EPrints as a backup repository to ensure the long-term preservation of the item metadata. EPrints also caters for the academic user while OMEKA is best suited for developing a product that can be used by any casual internet user. Accounts of the Conflict is aimed at both user categories.

The opinions outlined in this post are based on what appeared to be best suited to our particular needs on this project. My suggestion to anyone reading this would be to try the systems and see what works best for you or your institution.

Patrick Coyle