Symplectic User Group Conference 2011

Posted by gazjjohnson on 25 May, 2011

This Tuesday saw me down in London once again (a whole 4 days since my last trip down for CILIP Council) for the Symplectic User Conference 2011 at Hamilton House – so here are my notes – apologies for any typos as I was typing these on my knee! 

The day was split in two with talks in the morning and workshops in the afternoon.  Daniel Hook kicked off the day by announcing that Symplectic had partnered with Digital Science to work in the open science community.

The first speaker was Lorna Mitchell of Brunel University talking about the BRUCE project.  She mentioned BRAD (their Symplectic Elements) and repository BURA (which coincidentally I helped formally launch back in 2006).  They have linked BRAD and BURA together, although they noted that this was a longer process than they expected.  They have both a mandate, of which many academics remain unaware, and an OA publishing fund that researchers can bid to for OA publishing.

BRUCE was a JISC funded project.  Their aim was to facilitate the analysis and reporting of research information from existing data sources, using a CERIF framework. It brings together a lot of different sources of information from across the university system and generates bespoke reports based on them.  While the focus is often the REF, there are other university management areas of interest for the outputs.

The next speakers (Sarah Mallory, Rachel Proudfoot and Nicola Cockarill) spoke about the RePosit project (I’m on the expert group for this one).  The aim of the project was to increase the engagement with repositories to generate more content for them.  A lot of the focus was on advocacy but also on engaging with the repository community.  The project has five HEI partners and one commercial partner, using one CRIS and 5 different implementations.  The question they asked was: does simplifying the process of deposit increase the level of ingest for the IR?  At Queen Mary part of their problem was low visibility, and so their engagement with stakeholders aimed to get them up the agenda.  Embedding it within college strategies was key in this respect.

Plymouth rolled out their SE alongside their repository (PEARL) – but noted it was tricky in terms of time.  Not for the first time we heard about how much of a time sink setting up crosswalks between SE and the repository has been too; something I know will occupy a lot of my time in the coming months.  Plymouth are considering moving to a self-deposit model, as they feel this mirrors the model of staffing and library service.  However, having spoken with other repository managers, Emma noted that there were various concerns to address.  Their advocacy was met with a mixed reception: some were very enthusiastic, while others struggled to see where it fitted in with their research outputs.  However, illuminating academics with the knowledge of how restrictive (or not) publishers in their sector are with open access is a role all subject librarian staff should be very experienced in and engaged with.  Highlighting metrics of downloads, and demonstrating that students want or indeed expect to be able to download their local academics’ research from the repository, is important for keeping student experience levels high.

The third case history was from White Rose Repository Online (Leeds, Sheffield and York), where they reported a similar experience to Leicester: 25% engagement from academics even after a protracted advocacy campaign including direct email contact.  Awareness of WRRO was generally low.  Making deposits as easy as possible was a major point, as academics are simple creatures with time-poor lives.  They also suggested that there is a need to build a community of interest in CRIS related systems, not solely within Symplectic but across the IR, research support and IT environment.

Next up was Jonathan Breeze talking about research data management, from more of an IT and data life cycle POV.  Researchers think a lot about their data, but how do you keep it, or even what do you keep?  Research funders are increasingly expecting or requiring data as well as publications to be shared, and curated for long-term access.  Ownership of data is unclear even within institutions, let alone by whom or how it will be captured and stored.

Finally for the morning Peter Murray-Rust made a call for open bibliographies.  He declined to use PowerPoint or PDF on the grounds that they “Destroy information”.  He went on to say that we should use volunteers to gather bibliographic data rather than paid for systems.  He spoke a lot about community performing the data gathering or aggregation functions, but I must confess that while he raised some interesting points, practically I think a lot of what he talked about was aspirational rather than functional.  Most academics I’ve worked with over the years have very little interest in collating the literature; they’re more focussed on their own area of research and outputs.  What Peter was suggesting was certainly laudable, and may have worked in the isolated examples he suggested, but one has only to look to the Arts or Social Sciences to see where a lack of technical knowledge or awareness may prevent many academics from engaging with this.

After a sandwich free (but tasty) lunch we broke into two groups for workshops.  The one I was at looked at new REF functionalities for Symplectic, which, as I’ve yet to have much hands-on experience and given this is more the research office’s forte, left me a bit flat.  Then we went into groups to discuss where the problems with REF submission functionality in Symplectic will be.  Again, somewhat out of my area of knowledge so not something I felt informed enough to contribute to.

All in all there was a lot to talk about with the other delegates on the day, and I especially benefitted from conversations with a number of my fellow repository managers; focussing on the implementation side of Symplectic Repository Tools.


On the Road to IRIS: Modules & Testing

Posted by gazjjohnson on 3 February, 2011

IRIS is a name you’ll be hearing me talk a lot about this year on here and in the flesh.  It’s the name we’ve given to the prospective new research information management system that our Research Office, ITS and library teams are working towards implementing.  My involvement is naturally on the repository side of things, considering how the LRA will integrate with the new system. We’re in early days as yet, and the ink’s not quite dry on the supplier contract so I can’t speak too much about that.

What I did want to blog about was the related LRA work we’re currently doing.  One of the long standing requirements for the IRIS project is to upgrade the DSpace version we currently use (1.4.2, fact fans) to something…a little more this decade (1.7).  An upgrade to the software has been something I’ve been trying to move towards for the past couple of years, and now we’re moving towards this at speed I couldn’t be happier.

It looks like we’re going to have a test instance of the platform up and running in the next few days, and so I’m starting to think about two critical things for the live system.  The modules that are essential for the way the modern repository needs to run, and the kind of testing that we need to put the test instance through so we can be sure it’s running sweet and dandy and fine as candy.  I’ve some ideas already, some from my repository wishlist others from ideas that have come to me while I’ve been talking with the other members of the IRIS team.

But naturally I’ll welcome suggestions from any readers of the blog or pointers to resources that I clearly should already know about testing DSpace…but clearly don’t!

Repositories and CRIS WRN event article

Posted by gazjjohnson on 23 August, 2010

Nick Sheppard, Leeds Met University (aka MrNick on twitter) has written a good article in the most recent Ariadne about the Welsh Repository Network/JISC workshop back in May looking at the interaction between CRISes* and repository systems.  As I was unable to get to this event due to prior commitments, it was good to have a chance to catch up on the discussions.

I was interested to note that a CRIS (Current Research Information System) can go by many names – given that the UoL Research Office often refer to them as RIMS – Research Information Management Systems.  They’re not alone as many universities seem to have renamed them as RMAS or ERA and the like.  But at their heart they are systems that not only gather in research publication data (and much more), but actively link to other systems – chief among them from my perspective interlinking with a repository.

The question “Is an IR a subset of a CRIS?” posed by one speaker (Simon Kerridge, ARMA) is an interesting one.  Having seen a number of recent CRIS vendor demos, it is one that is clearly approached in different ways by different organisations.  Some very much see the IR as a satellite system, fed largely (but not entirely) by the CRIS.  For others it is more of a subsumed system – with a visible front end peeking out, but the rest of the body absorbed by the greater whole.  I must confess so long as the workflows for such issues as rights verification and data management are still handled by the elite repository administration team I don’t have an especial problem either way.  However, if a CRIS/Repository union means that a repo is just a reflection of the CRIS data set, locked down without the additional resources embodied and ingested by the IR over and above the REF related items; well then I’m a little more uneasy.

The talk from St Andrews’ Data architect Anna Clements (which came with some interesting but not readily comprehensible diagrams) brought up the CERIF standard.  Interesting that St Andrews has been pursuing links to their repository for far longer than many other institutions, which has demonstrated the advantages of working closely together with research support personnel (something I’ve benefited from here at Leicester in the past two years and can heartily concur).

Meanwhile William Nixon and Valerie McCutchean of Glasgow gave a very useful overview of the integration of the repository with a CRIS.  I was able to plot from my own experiences whereabouts we are in this process here at Leicester.  They raised a valuable point about author authorities – something that has long concerned me as an issue to which I don’t have a ready solution.  In some regards I’m hoping the CRIS implementation here will allow us to tackle and resolve this at that point – given that unique IDing of authors is something that is key for bibliometrics and REF returns alike.  I notice William doesn’t appear to have offered a solution though in his talk, which is perhaps a slight concern for me.  I wonder how difficult it is going to be to match an author of a non-REF item that routes into the repository from beyond the CRIS with the institutional verified author list.  And what about external additional authors?  I suspect this is going to be a major issue for me and my team to resolve and one that I’d welcome external insight on.

Finally my old friend Jackie Knowles talked about the pitfalls of implementation – most of which I am, thankfully, already well aware.  I think we definitely need more of these warts and all case study examples though; as at the end of the day those of us working at the sharp end of repository/CRIS interlinking will need to know how to work around so many of them.

It sounds like this was an excellent day (and perhaps in serious need of near future repeating!) and a definite must read article for anyone about to establish, or already working towards, a CRIS/Repository interlink.

Ready for REF CERIF Workshop (King’s March 2010)

Posted by gazjjohnson on 24 March, 2010

This Tuesday I travelled down with Steve Loddington of the Research Support Office, to King’s College London’s Waterloo Campus to attend a Ready4Ref (R4R) CERIF workshop. Following an overview of the day from Mary Davies (Kings) the day proper began. What follows are my notes and comments on the sessions; hopefully slides from the event will be available online shortly.

CERIF4REF, Richard Gartner, Kings College
The CERIF standard is very complex, almost too complex for most users to fully understand. The RAE2008 was used to shape CERIF4REF, as REF2012 standards remain as yet unannounced. Repository and CRIS outputs on systems that adhere to the standard can be fed through CERIF4REF and provide a single output to the REF assessors (in principle). There is a data dictionary that defines the standard and the elements within it. They are going to take RAE data from Kings and process it as part of a trial to check that this works well.

CERIF, CRISes & Research Databases, Marc Cox
Marc talked about King’s CRIS, developed in house 2004-7, and developed originally as a research management tool; although the RAE overtook and drove it towards administrators rather than academics which was not the original intent. Took data from HR, student, awards and finance & publications from TR WoS (author ID a problem) – now use the WoK API to take data, although that was quite a challenge. At the moment administrators (mostly) and academics (few) are keeping the publications up to date.

CERIF is a standard data model that describes research entities and their inter-relationships, originally developed with support of EU. It is architecture independent. 4 main data fields from RAE2008 taken for CERIF4REF. A number of system and data tweaks were needed to these four research data fields to make it compatible with CERIF. RA1 data was relatively easy, although RA2 data was more difficult to map. RA3a/b and RA4 couldn’t be mapped without the base data which created them.
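To make the "entities and their inter-relationships" idea a little more concrete, here is a minimal sketch in Python. It is not the CERIF schema itself: the class names, fields and role labels are hypothetical simplifications of my own, and the real standard is far richer. It only illustrates the general shape: things (people, projects, publications) on one side, and separate link records carrying a role and optional dates on the other.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """A research 'thing': hypothetical stand-in for a CERIF-style base entity."""
    id: str
    kind: str   # e.g. "person", "project", "publication"
    label: str


@dataclass(frozen=True)
class Link:
    """A relationship between two entities, with a role and optional dates."""
    source: str             # id of the entity the link starts from
    target: str             # id of the entity the link points to
    role: str               # e.g. "author", "funded_by" (illustrative labels)
    start: Optional[date] = None
    end: Optional[date] = None


def related(entity_id: str, links: List[Link], role: Optional[str] = None) -> List[str]:
    """Return target ids linked from entity_id, optionally filtered by role."""
    return [
        link.target
        for link in links
        if link.source == entity_id and (role is None or link.role == role)
    ]


# Illustrative data only
person = Entity("p1", "person", "A. Researcher")
paper = Entity("pub1", "publication", "An example paper")
project = Entity("proj1", "project", "An example project")
links = [
    Link("p1", "pub1", "author"),
    Link("p1", "proj1", "investigator", start=date(2009, 1, 1)),
]
```

Keeping the relationships as records in their own right, rather than as fields on the person or the paper, is what lets the same data answer different questions (who wrote what, who worked on which project and when) without reshaping it.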

Benefits from the approach however included RAE forms generated from style sheets that can be cross compared with php scripts to check accuracy. Next steps are to generate real King’s data in CERIF xml format and exchange data with other CERIF compliant systems.

Using ISI Web of Science Data in Repositories, Les Carr
EPrints has had plug-ins that do this for a while on an individual basis, but due to a change in licences they now have access to the API for direct deposit. A SWORD-based ISI deposit for EPrints was examined, although as Les noted the technology wasn’t at the heart of the issue, as all repositories work in a similar fashion in the big picture. There is a need for a repository editorial step, which is a manual one, so it can be like drinking from a fire hose: too much data flooding in, and how can you deal with it with established workflows? The data download may not be a straightforward exercise, e.g. student papers and non-peer-reviewed items are listed on WoS as well as academic papers. Les showed an example of selecting one academic and the process to go through to weed out the non-relevant items (a manual process): 38 items ingested initially in about a minute, then about 10 minutes for the manual process of removing duplicates and irrelevant items. Questions of how to use this – monthly update? On a per user basis?
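As a rough illustration of the weeding step (this is not Les’s code, nor anything from EPrints or the WoS API), here is a small Python sketch of how a batch of imported records might be triaged before a human editor looks at them. The record fields (`title`, `doi`, `type`), the allowed document types and the matching rules are all assumptions of mine for the sake of the example.

```python
import re


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation/whitespace for loose matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def triage(imported, existing, allowed_types=frozenset({"article", "review"})):
    """Split imported records into (keep, rejected).

    A record is rejected if its type is not wanted, or if it matches a record
    already held (or already kept from this batch) by DOI or normalised title.
    """
    seen_dois = {r["doi"].lower() for r in existing if r.get("doi")}
    seen_titles = {normalise_title(r["title"]) for r in existing}
    keep, rejected = [], []
    for rec in imported:
        doi = (rec.get("doi") or "").lower()
        title_key = normalise_title(rec["title"])
        if rec.get("type") not in allowed_types:
            rejected.append((rec, "type not wanted"))
        elif (doi and doi in seen_dois) or title_key in seen_titles:
            rejected.append((rec, "duplicate"))
        else:
            keep.append(rec)
            if doi:
                seen_dois.add(doi)
            seen_titles.add(title_key)
    return keep, rejected
```

Something like this would only thin the fire hose; the rejected list still wants a human glance, since loose title matching will occasionally throw out a genuinely different item.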

Les moved on to look at repositories as a CRIS – since repositories manage research or teaching or academic outputs and are broader in description and purpose. But what about other databases and information resources across campus (Finance, HR, Grants database)? CRISes pull all the disparate data together and present a unified view of it, which includes the repository. EPrints has attempted to accommodate the CERIF data – not just publications but projects and organisations.

E.g. previously a project was added in the metadata – now they are objects in their own right, linking from metadata record to a page about the project itself; with contributors rather than authors. Data can be exported and imported in CERIF format. This joined up integrated resource can help develop research case studies for demonstrating impact and output. I imagine that, useful though this is, it does add yet another load to the already busy repository administrators’ workflows. However, I can see a significant advantage to the repository that offers this kind of joined up service. I doubt Leicester will go this route, given our interest in a separate CRIS system at the heart of the research management agenda.

After a brief Q&A session we moved onto lunch. After lunch we broke into two discussion groups, one looking at the perceived benefits or flaws in CERIF; along with the practicality of auditing and standardising institutional systems with it. These sessions then reported back on the points that had been raised. Notably on average for those in attendance having data information systems that could be audited and made compatible with the CERIF standard was a reasonably attractive opportunity, however there were mixed concerns on the technical expertise being available in house at short notice to participate. When it came to staff resource available to take part in such an audit, virtually the entire group felt that this was the biggest obstacle to overcome.

Overall this was an interesting day, and while it was more on the CERIF data standard than the REF itself as I had hoped I was still able to take away some points for further thought.

[Edit: Slides from the event are now available here:]

