Data management and upcoming conferences

First, we would like to announce that we will be participating Archiving 2014 conference at Berlin in May. The OSA project has two papers: Flexible Data Model for Linked Objects in Digital Archives by Mikko Lampi (MAMK) and Olli Alm (ELKA – Central Archives of Finnish Business Records) and Micro-services Based Distributable Workflow for Digital Archives by Heikki Kurhinen (MAMK/Otavan Opisto) and Mikko Lampi (MAMK). The first paper is about the data model designed during the Capture project. The model is implemented in OSA software and is further developed until the end of the project. The main technologies behind the model are the Fedora object model and RDF. The workflow paper is about the software developed by Heikki as his bachelor thesis. It is designed to be very simple and able to be integrated with any software.

Here is the latest development news. We started implementing mass modification features for managing the object metadata before and after the ingest. Mass update is very useful for describing the batches of files before ingesting them. Adding common metadata about the owner or the origin can help the ingest and management processes. But we found that it is not that simple to modify descriptive metadata after the initial ingest. Therefore, in the upcoming beta version, the mass updates are available only for files in the workspace before ingesting. We will continue to develop the archive mass updating during the spring.

Much of the Fedora content models have been refactored to contain only the absolute minimum data required to understand and manage the objects. The information about forms and organization specific mappings and such were removed from the archive because they are only views or interpretations of the data and not the data definition itself. This will make the design more consistent and allow organizations and users to have much better customization features.

After the last posting, we found out that the current GSearch solution doesn’t support the latest Solr, which we required because of the Finnish language support. This has been resolved with a new release of GSearch. Because our Solr and Fedora are installed in separate servers, we cannot use GSearch’s reindex functionality. I did some initial testing with SSHFS and NFS for connecting the servers but this approach is not very sustainable and network or server errors can cause index desynchronization. We will develop a module to keep track on the sync status and perform reindexing as needed.



About Mikko Lampi

I'm an open-minded technologist interested in data, design, software, business and a better future. I work as a Research Manager at the Mikkeli University of Applied Sciences.
This entry was posted in Data management, Fedora Commons, Software development and tagged , , , , , , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s