This Tuesday I travelled down with Steve Loddington of the Research Support Office to King’s College London’s Waterloo Campus to attend a Ready4Ref (R4R) CERIF workshop. Following an overview of the day from Mary Davies (King’s), the day proper began. What follows are my notes and comments on the sessions; hopefully slides from the event will be available online shortly.
CERIF4REF, Richard Gartner, King’s College
The CERIF standard is very complex, almost too complex for most users to fully understand. The RAE2008 was used to shape CERIF, as the REF2012 standards remain as yet unannounced. Repository and CRIS outputs from systems that adhere to the standard can be fed through CERIF4REF to provide a single output to the REF assessors (in principle). There is a data dictionary that defines the standard and the elements within it. The plan is to take RAE data from King’s and process it as part of a trial to check that this works well.
CERIF, CRISes & Research Databases, Marc Cox
Marc talked about King’s CRIS, developed in house in 2004-7 and originally intended as a research management tool, although the RAE overtook it and drove it towards administrators rather than academics, which was not the original intent. It took data from HR, student, awards and finance systems, and publications from TR WoS (author ID was a problem); they now use the WoK API to take data, although that was quite a challenge. At the moment administrators (mostly) and academics (a few) are keeping the publications up to date.
CERIF is a standard data model that describes research entities and their inter-relationships, originally developed with the support of the EU. It is architecture independent. Four main data fields from RAE2008 were taken for CERIF4REF. A number of system and data tweaks were needed to make these four research data fields compatible with CERIF. RA1 data was relatively easy to map, although RA2 data was more difficult. RA3a/b and RA4 couldn’t be mapped without the base data which created them.
Benefits from the approach, however, included RAE forms generated from style sheets that can be cross-compared with PHP scripts to check accuracy. Next steps are to generate real King’s data in CERIF XML format and exchange data with other CERIF-compliant systems.
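To make the idea of "research entities and their inter-relationships" a little more concrete, here is a rough sketch of my own (not something shown on the day). The class, field and role names are simplified, hypothetical stand-ins and are not the element names from the CERIF schema or the CERIF4REF data dictionary; the sample records are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """A research entity, e.g. a person, project or publication."""
    id: str
    kind: str  # hypothetical labels: "person", "project", "publication"
    name: str


@dataclass(frozen=True)
class Link:
    """A typed, optionally time-bounded relationship between two entities."""
    source_id: str
    target_id: str
    role: str  # hypothetical roles: "author", "contributor"
    start: Optional[date] = None
    end: Optional[date] = None


def linked_to(entity_id: str, role: str, links: List[Link]) -> List[str]:
    """Return the ids of entities that entity_id is linked to in the given role."""
    return [l.target_id for l in links
            if l.source_id == entity_id and l.role == role]


# Invented sample data, purely for illustration.
entities = [
    Entity("pers-1", "person", "A. Researcher"),
    Entity("proj-1", "project", "Example Project"),
    Entity("publ-1", "publication", "An Example Paper"),
]
links = [
    Link("pers-1", "publ-1", "author"),
    Link("pers-1", "proj-1", "contributor", start=date(2009, 1, 1)),
]

authored = linked_to("pers-1", "author", links)
```

The point of keeping the relationships as separate records, rather than as fields on the person or the publication, is that the same entities can be linked in several roles and over different periods without changing the entities themselves.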
Using ISI Web of Science Data in Repositories, Les Carr
EPrints has had plug-ins that do this on an individual basis for a while, but due to a change in licences there is now access to an API for direct deposit. A SWORD-based ISI deposit for EPrints was examined, although as Les noted the technology wasn’t at the heart of the issue, as all repositories work in a similar fashion in the big picture. There is a need for a repository editorial step, which is a manual one, so it can be like drinking from a fire hose: too much data flooding in, and how can you deal with it with established workflows? The data download may not be a straightforward exercise, e.g. student papers and non-peer-reviewed items are listed on WoS as well as academic papers. Les showed an example of selecting one academic and the manual process to go through to weed out the non-relevant items: 38 items ingested initially (about a minute), then about 10 minutes for the manual process of removing duplicates and irrelevant items. There are questions of how to use this – a monthly update? On a per-user basis?
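As an aside, some of the weeding could in principle be pre-screened before the manual editorial step. The following is a minimal sketch of my own, not Les’s code or anything from EPrints: it assumes each downloaded record is a plain dictionary with hypothetical "title", "doc_type" and "year" keys (these are not the real WoS field names), and the sample records are invented.

```python
import re
from typing import Dict, Iterable, List, Set


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and spacing for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def weed(records: Iterable[Dict], allowed_types: Set[str]) -> List[Dict]:
    """Drop records of unwanted document types and likely duplicates.

    Two records count as duplicates when their normalised title and year match;
    the first one seen is kept.
    """
    seen = set()
    kept = []
    for rec in records:
        if rec["doc_type"] not in allowed_types:
            continue  # e.g. letters or other non-peer-reviewed items
        key = (normalise_title(rec["title"]), rec["year"])
        if key in seen:
            continue  # probable duplicate of an earlier record
        seen.add(key)
        kept.append(rec)
    return kept


# Invented sample records, purely for illustration.
records = [
    {"title": "A Study of Things", "doc_type": "article", "year": 2009},
    {"title": "A study of things.", "doc_type": "article", "year": 2009},
    {"title": "Letter to the editor", "doc_type": "letter", "year": 2008},
    {"title": "Another Paper", "doc_type": "article", "year": 2010},
]

kept = weed(records, allowed_types={"article"})
```

A filter like this would only shrink the pile; deciding whether a paper really belongs to the right academic would still need a person to look at it.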
Les moved on to look at repositories as a CRIS, since repositories manage research, teaching or academic outputs and are broader in description and purpose. But what about other databases and information resources across campus (Finance, HR, Grants database)? CRISes pull all the disparate data together and present a unified view of it, which includes the repository. EPrints has attempted to accommodate the CERIF data – not just publications but projects and organisations.
E.g. previously a project was added in the metadata; now projects are objects in their own right, with the metadata record linking to a page about the project itself, which lists contributors rather than authors. Data can be exported and imported in CERIF format. This joined-up, integrated resource can help develop research case studies for demonstrating impact and output. I imagine that, useful though this is, it does add yet another load to the already busy repository administrators’ workflows. However, I can see a significant advantage to the repository that offers this kind of joined-up service. I doubt Leicester will go this route, given our interest in a separate CRIS system at the heart of the research management agenda.
Discussions
After a brief Q&A session we moved on to lunch. After lunch we broke into two discussion groups, looking at the perceived benefits or flaws in CERIF along with the practicality of auditing and standardising institutional systems with it. These sessions then reported back on the points that had been raised. Notably, on average those in attendance felt that having information systems that could be audited and made compatible with the CERIF standard was a reasonably attractive opportunity; however, there were mixed concerns about the technical expertise being available in house at short notice to participate. When it came to the staff resource available to take part in such an audit, virtually the entire group felt that this was the biggest obstacle to overcome.
Overall this was an interesting day, and while it was more about the CERIF data standard than the REF itself, which is what I had hoped for, I was still able to take away some points for further thought.
[Edit: Slides from the event are now available here: http://www.kcl.ac.uk/iss/cerch/projects/portfolio/r4r.html]