I have been thinking lately about repository architecture, and wondering whether the term ‘repository’ is actually skewing our idea of what we should be doing to preserve research output, disseminate it via Open Access, report on it, and do the other things we want our repositories to do.
One of the ideas I’m keen on exploring is what librarians call a “discovery service” or a “discovery layer”: basically a web view of a number of different services brought together in a smart index, with faceted browsing. An example is the National Library of Australia’s Single Business Discovery Service prototype, which uses Apache Solr indexes to search across eight different types of resource all at once. In this post I want to show another place where this might be useful, using a case I came across when I was looking for sample files for the conversation with Les Carr about PowerPoint dismantling.
I’m not picking on Southampton here; they’re leaders in Open Access, and the EPrints software they have given our community is wonderful (in parts). But I would like to use their two (are there more than two?) EPrints repositories as an example of the advantages of a Solr-powered discovery layer.
How this came about: I was poking around the Southampton EPrints site and I grabbed a PowerPoint presentation to use as an example. I knew that it was by Liz Lyon and Les Carr because that metadata was in the file, and The Fascinator managed to extract it. (Turns out there were a bunch of other authors listed on the EPrint, but maybe they were not authors of the PowerPoint.)
Now, let’s see if I can find it again. What you’re about to read might reflect badly on my search skills, but this is roughly what I did, and I’m sure there are other bumblies out there like me.
A search on e-Prints at Southampton returns 8 matches for the string “Les Carr”. But the thing I’m after is not in that list. It took me a while to remember (and I did have to remember, the site doesn’t tell you this) that there’s another EPrints site for the School of Electronics and Computer Science (branded EPrints rather than e-Prints). So, having found the ECS EPrints site, I searched for Les Carr. No results. Searching for plain-old Carr in either site gave me too many results to sort through. After that I went back to the blog post where I talked about the thing and found a direct link, and yes, it’s there in the Southampton IR (not ECS), but the name I was after is Carr, Leslie. I can find it using that search (and I note that it was deposited by Carr, Dr Leslie). Right. Got it.
Author
Carr, Les (56)
Carr, Leslie (55)
Carr, Les A. (38)
Carr, LA (15)
Carr, L (10)
Carr, Leslie A. (10)
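The author list above reads like a facet on exact name strings, which is why one person shows up under several headings. Here is a minimal Python sketch of that behaviour, assuming a merged index over two repositories. The records, source labels and the `author_facet` helper are all made up for illustration; this is not how Solr or EPrints is implemented, just the counting idea behind a facet.

```python
from collections import Counter

# Hypothetical records standing in for items harvested from two repositories.
# Titles, source labels and author strings are invented for this sketch.
records = [
    {"source": "soton", "title": "Paper A", "authors": ["Carr, Leslie"]},
    {"source": "soton", "title": "Paper B", "authors": ["Carr, Les", "Lyon, Liz"]},
    {"source": "ecs", "title": "Paper C", "authors": ["Carr, Les"]},
    {"source": "ecs", "title": "Paper D", "authors": ["Carr, L"]},
]


def author_facet(records, surname=None):
    """Count records per exact author name string, optionally for one surname."""
    counts = Counter()
    for record in records:
        # set() so a name repeated within one record is only counted once
        for name in set(record["authors"]):
            if surname is None or name.split(",")[0].strip() == surname:
                counts[name] += 1
    # Highest count first, as facet lists usually display
    return counts.most_common()


for name, count in author_facet(records, surname="Carr"):
    print(f"{name} ({count})")
```

Because the count is keyed on the literal string, “Carr, Les” and “Carr, Leslie” stay separate entries, but at least a searcher can see every variant across both sites in one place and click through.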
Stepping back a bit, I note that these EPrints sites are not that easy to find: I got to them by typing in the URL, but if you go to the university home page and browse to Research it’s not obvious where the EPrints are. A search for Les Carr works, as you can get to his home page, where there are of course links to recent publications in EPrints. That’s better than what you get searching for Peter Sefton on the USQ site, where a lot of the hits are from things like the test content that ships with ICE. (No sign of my EPrints stuff in the first couple of pages.)
With Fez we allow author name strings on publications to be tagged with Fez author ids. We link these author ids to staff numbers, Thomson ResearcherIDs and Fez user account login ids. This helps a lot with very common names, research reporting and general searching (and Fez Solr faceting) of the repository for an author’s true list of publications. I can’t believe all repository software packages don’t offer this sort of thing, or do they? They didn’t a few years ago when we were deciding which software package to go with, and this is one of the main reasons we decided to go our own way. This idea of putting an identifier on an author name string on a publication seems surprisingly rare, even on journal publisher sites. ResearcherID looks like it will help here, especially with the latest version. But I still think this should be done inside the repository as well, not just by an external service, so you can link all the ID providers together, among other benefits.
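To make the author id idea concrete, here is a rough Python sketch. It assumes each publication keeps the name string as printed plus an optional local author id, and that a small registry links that id to other identifiers. The ids, field names and the `publications_by_author_id` helper are hypothetical and are not taken from Fez’s actual data model.

```python
from collections import defaultdict

# Hypothetical registry: one local id per person, linked to other identifiers.
# All id values here are placeholders, not real identifiers.
author_registry = {
    "au-1": {
        "display_name": "Carr, Leslie",
        "staff_number": "S0001",
        "researcher_id": "X-0000-0000",
    },
}

# Each publication stores (name string as printed, local author id or None).
publications = [
    {"title": "Paper A", "authors": [("Carr, Les", "au-1")]},
    {"title": "Paper B", "authors": [("Carr, Leslie A.", "au-1")]},
    {"title": "Paper C", "authors": [("Carr, L", None)]},  # not yet tagged
]


def publications_by_author_id(publications):
    """Group publication titles by author id, ignoring untagged name strings."""
    by_id = defaultdict(list)
    for pub in publications:
        for _name_string, author_id in pub["authors"]:
            if author_id is not None:
                by_id[author_id].append(pub["title"])
    return dict(by_id)


for author_id, titles in publications_by_author_id(publications).items():
    print(author_registry[author_id]["display_name"], titles)
```

Faceting on the id rather than the name string would collapse the variants into one entry, while the original strings stay on the records for display and the untagged ones remain visible as work still to do.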