Since starting on the ADMIRe project, I’ve been both impressed and terrified by the choice of research data solutions out there. There are open-source projects, commercial offerings and bespoke institutional software, and they all appear to do roughly the same thing in different ways. There are, however, certain key functionalities which I believe a research data management system should have:
- Ability to store and retrieve research data
- Ability to store metadata
- Ability to assign a unique identifier to each data set
- Ability to offer a workflow
- Handle security and access considerations
- Handle various data sets and files
- Be robust in terms of software and hardware architecture
- Be scalable
No doubt there are numerous other requirements, but these were some of the key functionalities I was looking to explore in greater detail at the JISC-sponsored MRD Hack Day in Manchester, 3rd-4th May 2012. From an ADMIRe perspective, we have a number of technical options that are based upon commercial products and infrastructure, so it was interesting to learn of relevant open-source software during the event and to see developers working to implement solutions to problems.
As my interests were in requirements rather than coding, I chose to participate in the metadata working group. During the event we reviewed existing data schemas and outlined a schema that would allow interoperability between institutional repositories. This schema will be used within ADMIRe, and the discussions around it gave insight into the types of activities and functions ADMIRe will have to provide.
A real highlight for me was the concept of data papers from Brian Hole of Ubiquity Press. Captured metadata can be used to form a data paper that is searchable and, most importantly, citable via a Digital Object Identifier (DOI). This is very much the friendly side of metadata and is one of the ways that ADMIRe should be presenting data to the end user.
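To make the idea of turning captured metadata into something citable a little more concrete, here is a minimal sketch in Python. It is only an illustration, not anything produced at the event: the `DatasetRecord` class, its field names, the `format_citation` helper and the placeholder DOI are all my own assumptions, loosely modelled on the kind of core properties (creators, title, publisher, year, identifier) that a DOI registration typically asks for.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical minimal metadata record for one data set."""
    doi: str                # placeholder identifier, not a registered DOI
    creators: list[str]
    title: str
    publisher: str
    publication_year: int


def format_citation(record: DatasetRecord) -> str:
    """Build a citation string of the form
    'Creators (Year): Title. Publisher. https://doi.org/<doi>'."""
    creators = "; ".join(record.creators)
    return (
        f"{creators} ({record.publication_year}): {record.title}. "
        f"{record.publisher}. https://doi.org/{record.doi}"
    )


# Illustrative values only; none of these refer to a real data set.
example = DatasetRecord(
    doi="10.5072/example-dataset",
    creators=["Researcher, A.", "Researcher, B."],
    title="Example survey data",
    publisher="Example University",
    publication_year=2012,
)

citation = format_citation(example)
print(citation)
```

The point of the sketch is simply that once the metadata is captured in a structured form, a readable and citable reference can be generated from it rather than typed by hand.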
On a personal note, it was good to learn from Alex Ball about the DataCite API and actually mint a DOI for a test data set, something that is integral to the reuse aspect of research data.
All in all it was a valuable two days, with plenty of interaction between developers and non-coders alike.