Where are we with open research data infrastructures in the UK?
This post originally appeared on Jisc Research Data
Since the Royal Society published its ‘Science as an open enterprise’ report in 2012 and RCUK set out its research data policies a year earlier, the UK has progressed on the open data agenda, despite quite a tumultuous political period and lots of change. Whether we are moving fast enough or not is something that the Open Research Data Taskforce has investigated in its landscape report published last week.
The report is an excellent overview for anyone who is starting in this area and wants to be on top of the UK and global policies and infrastructure for open data. Even the more seasoned research data professionals will find something new, I am sure of it.
The report comprises four parts: policies, infrastructure, challenges and conclusion. If you are interested in what open data is, how open data should be, or why research data should be shared, then I would recommend starting with the Royal Society’s ‘Science as an Open Enterprise’, because the Open Research Data Taskforce report dives straight into the heart of policies and infrastructure. Below I have selected a few interesting aspects from each part of the report.
Open Data Policies
Funders in the UK and Europe have made headway on their research data policies. EPSRC, for example, has introduced a requirement for all its funded universities to have a data policy. More importantly, in 2016 a number of the UK’s leading research organisations developed and signed the Concordat on Open Research Data.
Publishers have made little progress on developing and implementing research data policies. Those that led this effort, took their chance and actually tried to implement open data policies were sometimes met with backlash from the research community (remember PLOS?). So publishers will only move as fast as researcher culture allows.
In Europe, the Horizon 2020 programme introduced a data pilot. Furthermore, the Research Data Alliance was launched in 2013 as a bottom-up approach to developing research data infrastructure and sharing best practice. It is a truly global initiative: as many as 900 people attend its events, out of more than 4,000 members who actively participate in the different subgroups.
Open Data Infrastructure
Probably the most notable development in this area is the collaborative European Open Science Cloud, which aims to establish a trusted environment for research across European borders.
Repositories are the core of open data infrastructure. In the UK specifically, whilst the number of institutional data repositories is still low (two dozen registered on re3data), some institutions use their publications repository for data as well, and OpenDOAR records more than 200 in the UK. Other leading research organisations also offer generic or discipline-specific repository services. The most notable ones are the ESRC-supported UK Data Service, the recent Wellcome Trust research data repository run on figshare, and the Met Office’s DataPoint service. At Jisc, we are working with institutions and repository platforms to develop the research data shared service, which aims to increase interoperability and design a flexible institutional offering for research data management. Furthermore, to help with the discoverability of data published within institutional repositories, we built the research data discovery service. ANDS in Australia has worked on a similar initiative.
The second recommendation of the Royal Society ‘Science as an Open Enterprise’ report directed institutions to provide researchers with “knowledge resources and support” around data curation and other data management needs. Besides repository services, a number of institutions in the UK are leading providers of soft research data management support. However, as many are pointing out, there is still a real lack of skills in this area.
Challenges
Issues still persist around data policies. Neither funders and universities nor publishers have standardised their data language, and there are no adequate templates. The policy content is sometimes vague and non-prescriptive. The biggest challenge, of course, remains monitoring the implementation of policies.
In terms of infrastructure, the Taskforce identified a number of challenges, the most critical of which are probably interoperability of systems and services, automation and software. When talking about data as infrastructure, the challenges lie around repurposing and reproducibility, quality of formats and metadata, and security. Cultural issues have not changed much since the Royal Society report was published, and the same disciplines of astronomy and genomics continue to lead best practice and researcher behaviour around data.
Although a range of very good training programmes for open research data management have been developed by leading institutions, skills in this area are still lacking.
Governance was also noted as a big challenge in the report, and the group welcomed the timely publication of the joint Royal Society and British Academy landscape and recommendations for data governance in the 21st century.
What’s next?
Over the summer, the Taskforce will look at a number of disciplinary and more generic case studies of best practice and evident progress in open research data management within the UK and internationally. We will then develop a roadmap and recommend directions of travel for the UK within the open research data agenda. The final report is expected in early 2018.