Data management support for researchers at the University of Glasgow

Today is my first full day on the ADMIRe project and I have been spending some time looking at the really useful and informative research data management website produced by the University of Glasgow. The University of Glasgow aims to develop its research data management capacity and capability and has produced a draft research data policy and a draft RDM roadmap. Both documents can be viewed here.

What is really interesting is that they are aiming to conform to EPSRC expectations regarding the management and provision of access to EPSRC-funded research data between May 2012 and May 2015.

Their pilots will be conducted within schools/divisions where EPSRC funding is most active and they say that:

“…the pilot period will provide us with detailed information regarding the range of operational and support-related costs and sustainability issues and will be used to inform wider testing of the Research Data policy and Roadmap across all Colleges and Schools.”

Finch Committee on Open Access

The Finch Committee, set up last year by the science minister David Willetts, is examining how UK-funded research findings can be made more accessible.

Although the focus is specifically on journal articles, conference proceedings and monographs, there is also some parallel work on research data and other outputs being conducted by the Royal Society.

All the meeting notes are made available on the website of the Research Information Network (RIN). The working group met again last week and is due to report its findings later in the year. Definitely one to look out for.

DOIs for Research Data and RDM Training Materials

ADMIRe induction and #gettingtogrips

I will be starting full-time with ADMIRe next week, so this month has mainly been taken up with ADMIRe induction meetings, project planning with Tom and Bill, allocating tasks from the work packages, preparing presentations, and desk-based research on research data management (RDM) and data information literacy (DIL).

Research Data Management Training Materials

I have been looking at a very wide range of topics and issues related to research data management and have found the outputs from the JISC Research data management training materials (RDMTrain) projects particularly useful. Tomorrow, Tom and I will be joining Wendy (Faculty Team Leader, Medicine and Health Sciences) and colleagues from the Graduate School to explore the potential of the Research Data MANTRA course. This course is designed for PhD students and others who are planning a research project using digital data.

MANTRA is an Open Educational Resource (OER) that may be freely used by anyone. It is available through an open license for re-using, rebranding, and re-purposing. MANTRA is one of the key outputs from the first phase of the JISCMRD programme and has been produced by EDINA and Data Library, a division of Information Services, University of Edinburgh. Further information on the project is available from here.

Today I gave a presentation to the Information Literacy Development Group (ILDG) on RDM and data information literacy (DIL) skills. For some useful information on the role of data information literacy and libraries, the presentations from the recent Research Libraries UK (RLUK) event are all available here. This event aimed to clarify the research library agenda with regard to RDM.

DOIs for Research Data

I came across this interesting article today (full-text freely available), published in the May/June 2012 issue of the D-Lib Magazine:

‘Implementing DOIs for Research Data’, Natasha Simons, Griffith University, Australia. Natasha concludes that implementing DOIs has “raised governance questions common to other institutions that encouraged discussion and collaboration.”

MRD Hack Day Manchester May 2012

Since starting with the ADMIRe project, I’ve been both impressed and terrified by the choice of research data solutions out there. There are open-source projects, commercial offerings and bespoke institutional software, and they all appear to do roughly the same thing in different ways. There are, however, certain key functionalities which I believe a research data management system should have:

  1. Ability to store and retrieve research data
  2. Ability to store metadata
  3. Assign a unique identifier to each data set
  4. Offer a workflow
  5. Handle security and access considerations
  6. Handle various data sets and files
  7. Be robust in terms of software and hardware architecture
  8. Be scalable

No doubt there are numerous other requirements, but these were some of the key functionalities I was looking to explore in greater detail at the JISC-sponsored MRD Hack Day in Manchester, 3rd-4th May 2012. From an ADMIRe perspective, we have a number of technical options that are based upon commercial products and infrastructure, so it was interesting to learn of relevant open-source software during the event and to see developers working to implement solutions to problems.

As my interests were in requirements rather than coding, I chose to participate in the metadata working group. During the event we reviewed existing data schemas and outlined a schema that would allow interoperability between institutional repositories. This schema will be used within ADMIRe, and the discussions around this subject provided insight into the types of activities and functions ADMIRe will have to provide.

A real highlight for me was the concept of data papers from Brian Hole of Ubiquity Press. Captured metadata can be used to form a data paper that is searchable and, most importantly, citable via a Digital Object Identifier (DOI). This is very much the friendly side of metadata and is one of the ways that ADMIRe should be presenting data to the end-user.
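To make the data paper idea a little more concrete, here is a minimal sketch in Python of how captured metadata could be turned into a citable reference. The class name `DataPaperRecord`, its fields, the citation layout and the DOI `10.1234/example` are all assumptions made for illustration; they are not the schema outlined at the Hack Day, nor anything ADMIRe has adopted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataPaperRecord:
    """Hypothetical minimal metadata for a citable data set."""
    creators: List[str]
    title: str
    publisher: str
    year: int
    doi: str  # placeholder value in the example below, not a registered DOI

    def doi_url(self) -> str:
        # DOIs are resolved through the doi.org proxy.
        return f"https://doi.org/{self.doi}"

    def citation(self) -> str:
        # Assumed layout: Creator(s) (Year): Title. Publisher. DOI link
        authors = "; ".join(self.creators)
        return (
            f"{authors} ({self.year}): {self.title}. "
            f"{self.publisher}. {self.doi_url()}"
        )


# Illustrative record only; every value here is made up.
record = DataPaperRecord(
    creators=["Researcher, A.", "Researcher, B."],
    title="Example survey data set",
    publisher="Example University",
    year=2012,
    doi="10.1234/example",
)
print(record.citation())
```

The point of the sketch is simply that the same few fields a researcher supplies at deposit time are enough to produce both a searchable record and a reference others can cite.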

On a personal note, it was good to learn from Alex Ball about the DataCite API and actually mint a DOI for a test data set, something that is integral to the reuse aspect of research data.

All in all it was a valuable two days, with plenty of interaction between developers and non-coders alike.

New Staff for ADMIRe

I am delighted to say that ADMIRe now has two new members of staff. Dr Tom Parsons has now started here as Project Manager and Laurian Williamson has started as Service Developer for the project. Tom comes from a background in both bioscience and aerospace, with extensive commercial project management experience. Laurian most recently comes from the JISC Repositories Support Project here at the University and has a wealth of experience in online services, open access and repositories. Both Tom and Laurian can be contacted through the CRC.


Is “open data” accelerating internationally?

‘Open data’ is not just being hotly debated on the international stage; the pace of activity also seems to be accelerating.

In just the last month there has been… from Rome, from South Africa, in the US, from Finland

as well as the major news about the Wellcome Trust

What does all this mean?

Perhaps it’s too early to say that there is a definite pattern but I would contend that the debate is maturing rapidly and it’s certainly an exciting time to be involved in this area. Especially when you don’t have to be a large organisation to have a voice, as demonstrated by the always entertaining (and informative)

RLUK RDM discussion day, 16.4.12

Here are some interesting references picked up at the above event yesterday.

Work in the USA towards data archiving

An interesting post on the JISC Repositories list on 21st March highlights a survey being undertaken by the University of Michigan looking at the relationship between data archives and institutional repositories. Their 10-minute survey is available here for anyone who wishes to contribute.

The poster also mentions a draft 12-page guide (available on Scribd, but easier to read as a Google doc) on building links between social science data archives and institutional repositories. It is an interesting and useful document “. . . that provides guidelines and decision rules for institutional repositories at each stage of the archiving process: from appraisal to acquisition to curation to dissemination.”


Open data

We’ve been talking today about the move towards open data and how we can draw upon our experiences of trying to deliver open access publishing. Our open access work at the University of Nottingham tells us that we need to take a long-term view. That work has been ongoing for nearly ten years and even now there is resistance to publishing in this way. Achieving similar results for open data may well take just as long, and potentially has bigger hurdles to overcome, since researchers build careers upon their IPR and the data they generate and hold.

Is ownership of data much more ingrained into their personal USP as a researcher who brings value to an organisation than perhaps publications are? Inherent in publishing is a certain “letting go” that is accepted as part of the process of being an academic researcher. Is it the case that this does not necessarily exist in the psyche for datasets?

So in light of this we’re seeking to identify ways in which researchers are already “open” with their data, for example by depositing in national archives at the end of a project. In the current mindset this might be a tick box towards “sustainability” in the funding bid, but can we re-purpose that thinking and turn it to “being open with data”?

Does that then simplify the process of creating the “local repository” (and supporting metadata) such that the entry describes the dataset and where it is held, linking off to the national repository? Perhaps that is a small additional step that is achievable beyond what the researcher is already doing and can be a catalyst towards change and more openness? If so, then does that local repository become part of the framework we are striving for in ADMIRe for us to build a process around? A quick retrospective trawl might help us to get a quick win and build such a repository to show its potential.

Open access started with a “build it and they will come” approach, and perhaps we need to do the same for open data?

ADMIRe Benefits

We have been thinking, along with other projects, of the possible benefits of the ADMIRe project and the larger framework of RDM development within the University, in which ADMIRe fits.

Many of the benefits are qualitative in nature, although we do expect solid returns in terms of research exposure, management and re-use.

Our current thinking – very much in draft form for now – can be found here.

We would welcome comments and reflections from others who are going down similar paths in identifying institutional and other benefits for their RDM programmes.


Data Classification

Within the research data management remit that the ADMIRe project will cover, my particular interest is research data security. One aspect of data security that has been growing in significance in recent years is data classification. Without some sort of classification schema, it is difficult to define data security requirements without resorting to an “all-or-nothing” approach. A classification schema allows security guidance or security policy for researchers to be more granular and directed at those who need it most: those holding the most sensitive data such as personal data (as defined by the Data Protection Act), health information and financial data.

The draft data classification schema being worked on at Nottingham currently defines four categories: Public, Internal, Confidential and Highly Confidential.
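As a rough illustration of how such a schema could be used in practice, the sketch below encodes the four draft categories as an ordered enumeration in Python. The numeric ordering (least to most sensitive) and the rule that a data set takes on the classification of its most sensitive file are assumptions made for this example, not part of the Nottingham draft; the file names are invented.

```python
from enum import IntEnum
from typing import Iterable


class Classification(IntEnum):
    """The four draft categories, ordered by assumed sensitivity."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    HIGHLY_CONFIDENTIAL = 4


def most_restrictive(levels: Iterable[Classification]) -> Classification:
    """Return the highest classification among the given levels.

    Assumed rule: a data set is handled according to its most
    sensitive component. Raises ValueError if no levels are given.
    """
    return max(levels)


# Hypothetical data set made up of three files.
files = {
    "readme.txt": Classification.PUBLIC,
    "survey_responses.csv": Classification.CONFIDENTIAL,
    "analysis_notes.docx": Classification.INTERNAL,
}
overall = most_restrictive(files.values())
print(overall.name)
```

With an ordering like this, guidance can be attached to each level rather than applied uniformly to everything a researcher holds.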

Discussions with colleagues at other universities suggest that there has been limited appetite for defining and rolling out data classification schemas. Given the scale of change usually required and the potential impact on organisations, that’s hardly surprising. However, such schemas are increasingly seen as a necessary step in moving institutions towards international standards for information security such as the ISO 27000 series.