JISC MRD Workshop thoughts

The ADMIRe team had an entertaining few days at the JISC MRD workshop on the 24th-25th October 2012. It’s very interesting to learn from our fellow travellers and see where our commonalities lie.

Some highlights for us were:

  • Evaluating which data to retain and which to discard; the NERC checklist was particularly interesting
  • Using the JANET brokerage to lever resources and explore shared RDM services
  • Training and guidance – there were some useful examples of RDM websites
  • RDM business cases – Bristol are far ahead here

The final point is one that is looming ever larger in our work. We’re a year into the project and the focus has shifted from analysis to implementation – a noticeable change that has required a shift in both mindset and working patterns. Our RDM website is due to launch in November and our RDM survey results are also due for release – both deliverables that will engage the academic community and generate interest in our fledgling service. In this respect, we’re working hard to understand what the University requires and what roles, skills and departmental structures are needed to support RDM requests.

Clearly, we expect the website to generate enquiries, but do we signpost them to other people (e.g. Library, IT Support) or engage with them ourselves? These are questions that others at the conference are facing, and many are at a similar point. Is launching a website equivalent to launching a full-scale service with limited support (i.e. only the JISC-funded team), and is this a wise move?

If anything, our progress to date is making it exceptionally clear that now is the time to take our thoughts on sustainability and turn them into a structured business case – so expect updates on this as we progress.

ADMIRe presentations and posters are available here:

ADMIRe JISC Workshop Poster

University of Nottingham RDM policies

RDM@nottingham training event

Last week I was invited to give a two-hour workshop/presentation on research data management at the University of Nottingham (UoN) Academic Librarians’ Forum (ALF). This forum meets regularly to discuss wider LIS issues and topics relevant to the librarians’ role in supporting researchers at UoN.

An integral part of the ADMIRe project is to identify the RDM training needs of both our research community and those who will be providing research data management support services. A key aspect of raising RDM awareness at UoN is the delivery and organisation of RDM training, advocacy and outreach. This was a great opportunity to gather some initial thoughts and views on how the academic librarians saw the future of a sustainable RDM service, and in particular the skills they may already have in managing and finding information.

The title of the event was ‘What is research data management?’ and the event organiser provided me with a series of RDM topics to cover during the session. The aim of the event was to raise awareness of research data management (RDM) and identify some of the key skills required for the delivery of a research data management service. The event and user feedback from the event will inform and enhance the development of the RDM service at the University of Nottingham.

We had 12 attendees and two interesting break-out activities. One was around the RDM skills matrix ADMIRe has been working on, and the other was a review of ‘Ten recommendations for libraries to get started with research data management’, published in August 2012 by LIBER (Ligue des Bibliothèques Européennes de Recherche – Association of European Research Libraries).

Activity one – RDM skills matrix

The RDM skills matrix includes several key elements of the research lifecycle, and attendees were asked to identify where they thought library staff could provide support on a variety of RDM issues. The majority agreed that they already had the skills in the following areas:

  1. Metadata
  2. Open Access and Repositories
  3. Data discovery and data re-use
  4. Compliance with funding policies and requirements
  5. Data classification

Some of the areas where they felt they needed further training included:

  1. Data types
  2. Data storage
  3. Data preservation
  4. Data archiving
  5. Data Management Plans


Notes from the 2nd DataCite Workshop

Tom Parsons and I attended the 2nd DataCite Workshop at the British Library Conference Centre on July 6th, which proved to be an excellent opportunity to compare notes with other institutions working on incorporating the DataCite metadata schema into their workflows.

Caroline Wilkinson has already written a report on the Workshop, and the slides from the Workshop are available. So rather than repeat that information, here are the notes I made on points raised during the day which seemed particularly relevant to our current work at the University of Nottingham – hopefully there will be something here that’s helpful to others as well.

DataCite Mandatory Metadata

  • Many metadata schemas exist; it’s advisable to choose or define one that meets your specific needs
  • “Title” should always be different from the article title: it’s the title of the dataset
  • When listing “Creators” (authors) in DataCite, it’s important to also define their roles and IDs
  • “PublicationYear” should be the date of public availability
  • “Publisher” should be the data center or archive making the data available.
  • “ResourceType” is currently being considered as a mandatory, rather than an optional field
  • Citation suggestion: Creator (Year): Title. Publisher. Identifier.
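The citation suggestion in the last bullet can be sketched in a few lines of Python. This is only an illustration of the "Creator (Year): Title. Publisher. Identifier." pattern from my notes, not DataCite's own tooling: the `DatasetRecord` class, the `format_citation` function and all the example values are made up for the purpose, and joining multiple creators with a semicolon is my assumption.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical container for the fields used in the citation pattern."""
    creators: list[str]      # dataset creators, not necessarily the article authors
    publication_year: int    # year the data became publicly available
    title: str               # title of the dataset, not of the article
    publisher: str           # data centre or archive making the data available
    identifier: str          # persistent identifier, e.g. a DOI


def format_citation(record: DatasetRecord) -> str:
    """Render a record as 'Creator (Year): Title. Publisher. Identifier'."""
    # Assumption: several creators are separated by semicolons.
    creators = "; ".join(record.creators)
    # No trailing full stop, so it cannot be mistaken for part of the identifier.
    return (
        f"{creators} ({record.publication_year}): {record.title}. "
        f"{record.publisher}. {record.identifier}"
    )


# Invented example values, purely for illustration.
example = DatasetRecord(
    creators=["Smith, J.", "Jones, A."],
    publication_year=2012,
    title="Example dataset",
    publisher="Example Data Archive",
    identifier="doi:10.1234/example",
)
print(format_citation(example))
```

Keeping the fields separate until the last moment means the same record could feed a landing page or a metadata export as well as the citation string.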

Subject-Specific Metadata

  • There are a large number of additional subject-specific metadata schemas in use
  • e.g. Data Documentation Initiative – Standard for statistical and social science data (v 3.1 released in 2009)
  • Some datasets have huge numbers of contributors (e.g. genetics), where the list of contributors is itself a large dataset
  • For geospatial data, geographical extent is a crucial metadata item, which can be surfaced in landing pages as an embedded Google Map

Protocols and Standards

  • Bristol are providing serialisation using RDF/XML, and using SWORD as the repository deposit protocol
  • DC2AP – A DataCite Dublin Core Application Profile is in development
  • DataCite2RDF – Maps DataCite metadata to RDF
  • ISO 19101 – Deals with subsets of data
  • XForms – “XML format for the specification of a data processing model for XML data and user interface(s) for the XML data, such as web forms”
  • WAF – Web Accessible Folder

Useful Software

  • Bristol have used Apache Tika to extract metadata from data files
  • Orbeon Forms – XForms-compliant web form builder available in a free open source Community Edition
  • Ex Libris Rosetta – “highly scalable, secure, and easily managed digital preservation system”
  • Ex Libris Primo – “one-stop solution for the discovery and delivery of local and remote resources, such as books, journal articles, and digital objects”

Miscellaneous

  • A “Schematron” validates content as well as conformance to XML schema

ADMIRe RDM survey at the University of Nottingham

The more we become embedded with all things research data management (RDM) at the University of Nottingham the less time we seem to have to update this blog with our ADMIRe JISCMRD activities. I know how beneficial I find all the JISCMRD blog postings, especially learning from some of the projects which are at a more advanced stage than ours, so hopefully this posting will provide you with some idea of the work we have been doing.

July was a really busy month, so this is the first in a planned series of updates on some of the key activities that the ADMIRe team has been focusing on recently.

Research Data Management Survey

As Tom outlined in his blog posting earlier this month, our research data management survey (using the Bristol Online Survey tool) has been launched and will be open until mid-September. We currently have 196 responses from researchers across all faculties. UoN is a research-intensive university with more than 2500 career researchers (excluding PhD researchers).

Our survey is aimed at all UoN researchers (including PhD researchers) and we wanted to discover how data is used and managed across the University. Requirements gathering on RDM is a key activity for us, as we aim to deliver a sustainable RDM service which will facilitate and embed good RDM practice at UoN.

We will publish the survey results (anonymised) once they have been analysed, sometime during the autumn. Some interim results are as follows:

  1. 85% of respondents are creating or working with documents (txt, pdf, Word etc)
  2. 32% back-up their data daily
  3. 59% do not record or document any metadata about their data
  4. 66% work on externally funded projects
  5. 26% developed an RDM plan for their project
  6. 92% had not received any RDM training
  7. 129/196 respondents wanted to receive training in developing an RDM plan
  8. 49% said their research data was confidential to their research group
  9. 30% said they were unsure whether they were required to make their data publicly discoverable and accessible after the project closed
  10. 40% said they would not deposit their data in a subject/discipline specific repository and 48% weren’t sure
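Most of these interim figures are percentages, but item 7 is a raw count. As a quick sketch of how to put it on the same footing as the rest, the snippet below converts a count into a percentage of the 196 responses received so far; the `as_percentage` helper is a name invented for this example.

```python
def as_percentage(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


# 129 of the 196 respondents wanted training in developing an RDM plan.
print(as_percentage(129, 196))
```

That works out at roughly 65.8% of respondents so far.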

Plenty of interesting responses thus far for us to mull over. Tomorrow I will provide an update on our work with DAF and sensitive data, our planned RDM website, and other training and RDM awareness activities.

 

Event report: research data management and the role of libraries

On Tuesday I attended the excellent joint JIBS/RLUK event ‘Demystifying Research Data: don’t be scared be prepared’, held at the Brunei Gallery, SOAS, London. The event was aimed at subject/liaison librarians, key stakeholders who are likely to become increasingly involved in supporting research data management (RDM) activities as institutions start to develop their RDM policies and services. This event really did help to raise awareness of RDM and considered the roles that librarians have in delivering a robust RDM infrastructure and service within a university environment.

The programme was a good mix of presentations and breakout group sessions, and I left the event with the feeling that RDM is certainly a hot topic among university library staff, who are both challenged by and engaged with it.

All the presentations and notes from the breakout sessions will be made available on the JIBS website, so I will just blog about some of the highlights I took away from this event. Definitely worth having a look at all the presentations once they are made available.

Michael Day from UKOLN gave a thorough overview of the importance of RDM and outlined how, until recently, there was no consistent way of managing research data in universities. Increasingly, research bodies are becoming stricter in what they expect from the research they fund, and managing research data is important because it enables data re-use, ensures research integrity, improves research impact, and enables UK HEIs to fulfil any regulatory requirements.

He stressed the importance of buy-in from senior management on the necessity for good RDM practice, and reminded us that RDM is the shared responsibility of both the institution and the researcher.

When it comes to the institutional drivers for effective RDM practice, two were continually mentioned throughout the day, by several presenters and in the breakout sessions:

  1. Compliance with funding mandates and policies
  2. EPSRC expectations and their Roadmap 2012 – compliance is essential by 2015

Liz Holliday presented on the UWE JISCMRD project and she gave a personal reflection on future librarian roles in RDM and why librarians are, or should be, involved. Liz’s presentation can be viewed here.

Rachel Proudfoot from the University of Leeds presented on the JISCMRD RoaDMaP project, which is assessing data management requirements in a number of different subject disciplines and at different stages of the research application process (pre-award, live award, and post-award). She talked about current RDM capacity at Leeds and how important it is to ‘embed’ RDM as part of normal university practice.


RDM survey

Last week we launched our RDM survey across the University using the Bristol Online Survey tool.

So far the results are interesting: we’ve had 140 responses and the majority of people seem to be working with standard research files, e.g. Word, PDF, Excel or software-generated files.

Expect a full update when the survey closes.

 

First ADMIRe pilot in the Classics Department

We are pleased to announce that ADMIRe is now working with The University of Nottingham Classics Department. This promising pilot examines the storage and citation of large data sets and general research data.

When we first approached the department, we were surprised to learn that the size of their data sets is on a par with the sciences. They regularly use specialist equipment to scan and model statues and sculptures in museums; the resulting files are 1-3 GB each and are stored and backed up locally. So as well as providing feedback on our proposed data store solution, they will also be one of the first departments to test our prototype data file store service.

As well as large data, the collaboration gives us the opportunity to raise awareness of ADMIRe and JISC RDM by linking to an AHRC project on Digital Transformations in Arts and Humanities. Here our specialist knowledge and expertise on RDM can really add value to their project and strengthen the outputs of both ADMIRe and the Digital Humanities project.

More updates will follow as the pilot progresses.

Open access to research outputs

Two key publications have been made available this week, both of which are of interest to the ADMIRe project team. Firstly, we had the eagerly awaited publication of the Finch Report: “Accessibility, sustainability, excellence: how to expand access to research publications”. This 140-page publication presents the findings of the Working Group on Expanding Access to Published Research Findings, chaired by Dame Janet Finch. The report recommends a programme of action which will enable more people to read and use the publications arising from research. It makes ten recommendations and outlines the key actions necessary to implement them. An executive summary is available and the report has had some interesting media coverage this week, including in the Guardian and on the BBC.

The Royal Society today published their substantial report “Science as an open enterprise: open data for open science” which:

“highlights the need to grapple with the huge deluge of data created by modern technologies in order to preserve the principle of openness and to exploit data in ways that have the potential to create a second open science revolution.”

The report highlights six key areas for action, and these include:

  • Scientists needing to be more open amongst themselves and with the public and media
  • Greater recognition for the value of data gathering, analysis and communication
  • Common standards for sharing information in order to make data widely usable
  • Publishing data in a reusable form to support findings must be mandatory
  • More experts in managing and supporting the use of digital data are required
  • New software tools need to be developed to analyse the growing amount of data being gathered

The report includes some interesting case studies of data use and the costs of digital repositories.

It will be interesting to see the impact that both these publications have on academic scholarly communications and opening up access to research outputs (both publications and data).

 

Data citation, sharing data, and RDM at Nottingham

It has been a pretty hectic couple of weeks for Tom and me, filled with meetings with key University of Nottingham staff from different departments and divisions, all of whom are keen to facilitate and deliver good and effective research data management (RDM) practice at our institution. We have identified and contacted academics from all five faculties (Arts, Engineering, Medicine and Health Sciences, Science, and Social Sciences) to take part in our phase one RDM pilots, and we have also given plenty of thought to what we would like the University of Nottingham RDM website to contain and offer our research community. We are also working on an RDM@Nottingham survey which we hope will inform the development of the ADMIRe project.

Since my last blog post I have also attended some interesting external events including the excellent DataCite workshop at the British Library which covered topics such as how to mint a DOI (Digital Object Identifier), why making research data available and citable is important, and the challenges there are with citing research data. All the presentations from the day are available here.

I also attended the Repositories Support Project one-day event on scholarly communications and new developments in open access in London on 1st June. It was held at the stunning Art Deco venue of the Royal Institute of British Architects (RIBA), and the programme showcased some great examples of innovative approaches supporting data sharing, open access to research outputs and an open approach to scholarship. Videos and presentations from the event are all available here.

Ethics, consent and data sharing – for anyone interested in this area of RDM I would definitely recommend listening to the recording of the webinar delivered by Margaret Henty of the Australian National Data Service in April. She considers the myths around data sharing, meeting funding bodies’ obligations, informed consent, access control, and the importance of incorporating data sharing into research planning.

Also published this week is the Council on Library and Information Resources publication “How does big data change the research landscape for the humanities and social sciences?”. The full-text publication and associated press release are available here.

Linking peer-reviewed literature to associated datasets

OpenAIREplus is a large-scale EU project bringing together 41 pan-European partners, including three cross-disciplinary research communities. OpenAIREplus aims to:

“…create a robust, participatory service for the cross-linking of peer-reviewed scientific publications and associated datasets.”

The 30-month project launched in December 2011 (see Bill’s post on this launch) and on 11th June they will be presenting an OpenAIREplus workshop in conjunction with the Nordbib Conference 2012, Copenhagen, June 11-13, 2012. The OpenAIREplus workshop “Linking Open Access publications to data – policy development and implementation” has a very exciting programme, and I am hoping they will make the workshop presentations and outputs available after the event.

The workshop is aimed at anyone with an interest in this topic, and will be of interest to library managers, researchers, research funders, repository managers, journal editors and publishers, and research administrators. Topics covered include:

  • Preparing and writing institutional data management policies
  • An overview of funders’ responsibilities and requirements towards data availability and management
  • An overview of linking research publications and data
  • The research data landscape

Follow developments and news items on the OA EU infrastructure on Twitter @OpenAIRE_eu

Links of interest

OpenAIREplus press release

International conference: Structural frameworks for open, digital research – strategy, policy & infrastructure

OpenAIRE