As always, this blog is my opinion, and the stuff I write here is © Peter Sefton. I’m emphasising this because it is important to make the distinction here between what I personally think and my role at USQ, and the CAIRSS [update: fixed link thanks to @eric in comments] service we help provide for CAUL. I would like to note in particular that this post does not represent the view of USQ or the Australian Digital Futures Institute and is not written on behalf of CAIRSS. This post is my personal opinion.
I think that there is a gap in the market in the Australian Repository scene; I think a lot of sites would be interested in services and/or hosting around a completely open source software stack. In this post I will outline the kinds of services I think might be attractive, how they might be provisioned, and which bits of software I think would be suitable.
I am writing this for three reasons.
- I see frustration in the repository community that needs to be resolved. I’ve never used that phrase about the elephant in the room before, but I can smell one.
- I think that if it is resolved then my group at the Australian Digital Futures Institute (ADFI) might be in a position to supply some of the services.
- I don’t think that it is appropriate for ADFI or USQ to try to enter the market as a service provider or I would not be posting this, I’d be talking to our office of commercialisation.
Regarding frustration: I have been at public meetings where people have stood up and said quite explicitly that a particular piece of commercial software is not working properly for them. I know that some of the sites that use that software are shopping around for other solutions. I know that at least one bought it but decided not to deploy it. I know that in the past it has been hard to upgrade between versions. I know it has had performance and stability problems. And I have heard that it is very hard to get it configured to work for the ERA. You, the customers, know who you are. My question is why don’t you get together and (a) start making some noise and (b) encourage a bit of open competition? Together you represent an attractive market, with a set of similar repositories that are sitting in a very open easy-to-migrate system, with a dead-simple content model. If I were a customer I would be asking the vendor to make the whole thing open source, so the community could fix some of the bugs, and/or seeing if there was a compatible open solution with someone offering support – open source means open market for services.
Which reminds me, where is the Australian Dorothea Salo, blogging from a repository rat-hole about what’s really going on with repositories and the software that drives them? There’s a gap in the blogosphere now that she’s quit.
The gap: commercial repository service on an open source base
I know there is at least a small market for services because USQ has supplied such services to two small Higher-Education providers, helping them to set up their ePrints repositories and assisting them with sourcing a hosting provider. I know that there is a market for commercially supported repository software because a significant number of IRs in Australia are commercially supported. I don’t know of anybody actively developing and promoting a service which offers commercial support for open software, something like Southampton offers for ePrints. (You can get support for Fez, apparently, but as I write this the vendor is certainly not making it easy to find that out on their site.)
So what I’d like to see is one or more providers set up shop offering the following:
- Training, consulting, maintenance, help-desk support and software for running an IR where all the software is open source.
- Hosting for institutions who don’t want to do it themselves.
That’s it. It’s simple. Generally speaking there is no more inherent risk in ‘buying’ a supported open source system than a closed one. In some domains buying proprietary may get you lots more features – but I don’t see that in this market.
As a library you could treat it like any other software deal, except that at the end you not only get to keep the software, you can do what you like with it, subject to the terms of the open source license. I would imagine you would be looking at contract terms of something like 2 to 3 years to make this viable for the vendor.
There are a few models for this that I can think of. I’m sure you can think of more, so use the comments below.
1. The Southampton/ePrints model, where the copyright holder for the software offers services. I believe their business is going well.
2. A third-party model, where someone just sets themselves up in business – the software is free – away you go!
3. A hybrid model where one group does the training, deployment and support work, while subcontracting development to the relevant software owners. For example if you were offering support for ePrints you might try to negotiate something with Southampton to help fund the development of the product and resource bug-fixing activities.
[Note that above I declared an interest in this idea – I think my employer might be interested in us doing some of the development and consulting in a model like 2 or 3]
Based on a few years in this game, and running the technical team for the RUBRIC project, if I were going to set up a ‘classic’ Institutional Repository for the purposes of disseminating research via documents produced by your researchers, then I would probably pick ePrints. It works, it is stable and reliable, and it has all the workflow stuff you need for managing an IR, and every time I poll my technical team they agree. I think ePrints is less attractive for other stuff like photo collections and so on, but they keep working on it.
But of course the first phase for a potential vendor in this space would be to work out which packages they were going to support.
- If you were going after existing ARROW customers then you’d go for something that was compatible with their existing repositories, which happen to be set up and running in Fedora and mostly using the simple VALET system for ingest. So you would probably look to keep the Fedora stuff and maybe migrate to the new ARROW-funded version of VALET that my team helped write. I think that ePrints might be a good choice for some of these sites for their core IR functions, as it is vastly more configurable and usable than VALET, but that may be a bit hard to swallow. I am predicting a version of ePrints with a Fedora back-end, which might make people feel more comfortable about it.
- If you detected a need for DSpace support then that would be simple: support DSpace, unless those users are all looking to get into Fedora, in which case you can wait (probably a long time) to see if the new DSpace/Fedora foundation builds you a hybrid.
- Like I said before, I already know there is a market for ePrints services.
- If you wanted to take on Digital Commons or DigiTool, I’m not sure what you’d do.
It has been a while since I have caught up with Fez and Muradora, both of which are alive and well, so you would want to consider those, at least.
But I don’t think ePrints or any other IR is the whole story. I would prefer to see a much more flexible discovery system over the top of the IR. So if I were entering this market then I would also be pitching the thing that is popping up everywhere in library systems these days, Apache Solr.
For example, the NLA are testing a new Solr system across several data sources.
With lots of libraries using Solr for their catalogues via applications like VuFind, I think Solr is a new standard bit of infrastructure which is actually much more important to the customers than what you have on the back-end. Who cares about DSpace vs Fedora vs ePrints when you can mash it all up together and make it easy for people to create their own view of the repository using a bit of JavaScript embedded in a web page, or a little site written using a rapid development framework like Django or Rails?
Looking at our own situation, I think USQ should have a big fat Solr index of their course materials, their Intranet, their IR, their data (described using RIF-CS), feeds from their staff and students’ workblogs, delicious tagged-links, digital assets from the photography department, etc. That doesn’t mean it’s all accessible to everyone: Solr queries can be limited for security so that certain groups of people only see certain things, as seen in The Fascinator, which can act as a proxy in front of Solr to handle security.
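To make the security idea a little more concrete, here is a minimal sketch in Python of how a proxy sitting in front of Solr might limit every search to the groups a user belongs to. It is an illustration only, not how The Fascinator actually does it. The Solr URL, the core name `usq`, the field name `allowed_groups` and the `public` group are all assumptions made up for the example; the only Solr features relied on are the standard `q`, `fq` (filter query) and `wt` parameters of the select handler.

```python
from typing import Iterable
from urllib.parse import urlencode

# Hypothetical values for illustration: a local Solr core called "usq" and an
# "allowed_groups" field that records which groups may see each document.
SOLR_SELECT_URL = "http://localhost:8983/solr/usq/select"
ACL_FIELD = "allowed_groups"


def escape_term(value: str) -> str:
    """Quote a value as a Solr phrase, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_restricted_query(user_query: str, user_groups: Iterable[str]) -> str:
    """Return a Solr select URL whose results are limited to the user's groups."""
    # Assumption: documents anyone may see are tagged with the group "public".
    groups = ["public"] + sorted(set(user_groups) - {"public"})
    acl_clause = " OR ".join(escape_term(group) for group in groups)
    params = [
        ("q", user_query),                      # what the user typed
        ("fq", f"{ACL_FIELD}:({acl_clause})"),  # filter added by the proxy
        ("wt", "json"),                         # ask Solr for a JSON response
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_restricted_query("open access", ["staff", "library"]))
```

The point of the design is that the filter is added by the proxy, not by the JavaScript or Django front-end, so people building their own views of the index never get to choose which groups they search as.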
If there’s any interest in seeing the kind of service offerings I think would make sense spelled out in more detail let me know in the comments.
You know, I’d almost talk to LibLime about this. I don’t know that they do work in Australia, but since they started with Koha, I would think they’d have connections in that part of the world, at least. You’ve made a darn good business case here.
Hi Peter
You’re missing the second s in your url for the CAIRSS link in paragraph 1.
@Eric – thanks, I fixed that
My impressions after reading this, and after reflecting upon the implications of the ERA requirements, are that we should seriously consider moving towards a single, integrated national repository. With the high mobility of researchers and the level of collaborative publications, it strikes me that after some years there will be significant duplication of effort with individual publications located within multiple repositories. This is a waste of time, money and effort. With technologies like Shibboleth and Fedora a national repository service is possible. A national repository supported by ARC, ANDS, CAIRSS and NHMRC would satisfy the government’s Open Access agenda, as well as allow for the simplification of ERA submissions and could also be used to supplement ARC and NHMRC grant proposal submissions.
I agree with Amberyn.
I want to add some of my reflections and opinions* on this from a Fez perspective. We initially did pay for support from UQ and received very good start-up support. However as things evolved we didn’t really need to have a formal support arrangement – so model 2 worked here – except working on repositories has really become more of an informal partnership between Fez unis where we collaborate and help each other out primarily through a web based chat service. I’m sure UQ are willing to offer support but there are some business model realities here. But also I’m sure they’d be interested in assisting a commercial repository service in some ways more than doing this directly.
From the outside things could be better – such as pulling together wiki documentation into a nicer framework, regular release cycles, better support on mailing lists. Things would be different if Fez had a larger Australian uni base though. Also UQ are UQ focused with developing the s/w primarily for their purposes – which is only natural – plus they innovate way too fast: some recent things include researcherID profiles and citation counts/cited by linking with Thomson, Scopus and Google Scholar.
I have some issues with LAMP software bundles but building on top of Fedora Commons is very forgiving and brings the best of web 2.0 innovation and software reliability together nicely. I certainly wouldn’t take Fez out of the mix. One day I’d like to see Fez evolve past being a LAMP bundle…
I’d love to collaborate to establish Fez migration pathways and in a way already have by adding CNRI Handle functionality to Fez.
And I highly rate Kettle/PDI for ETL (with ERA in mind) although this is a temp situation for us.
Some great points in this post and responses. Amberyn’s comment about Oz repositories containing a lot of wasteful duplication is so true, and I like the idea of a single repository used by the Oz unis. It would need to accommodate more than just research publications – theses and photo collections, for e.g. – if it was to be widely accepted. Plus I can envisage some concerns about security issues, particularly if storage was central. Oz universities already have a good history of collaboration, so they’re in a good starting place.
Concurring also with the frustration of getting even simple changes made to a commercial ILMS. Happily, on the repo end, we’re a Fez library, like Pete, and appreciate the flexibility of being able to make changes to the software as the need arises. It is so easy that we are always trying to resist the temptation to make changes that are specific to ourselves, and hence going off on a tangent from the main Fez trunk. As for the support, UQ provides a great deal of informal support, but this isn’t something we should be relying on for the long term. Apart from UQ, I believe there is already a commercial firm called Catalyst which offers Fez support.
Open Access, preservation and user-friendly workflows could be more easily facilitated through a single Australian repository than through dozens of separate repositories, I think. And there are institutions which could host such a thing, whatever governance arrangements it would have. If it could go along with a centralised system for identifying researchers and grouping their work in sensible ways, it would be awesome.
I think the hard thing is that some repositories are now being used to meet very specific university-based needs – particularly reporting, but also around data. A centralised repository would make them less available as university management/reporting tools, as a long list of specific local requirements would get harder and harder to manage.
I don’t think it is a bad direction for repositories to become less about internal university needs, but some of the support from university managements seems to stem from their use as such.
But I’m not in the sector, so may not know what I am talking about.
This is very interesting, but an obvious omission is precise identification of the systems currently in use and their specific failings. Whilst this blog does not need to be a ‘shame file’ or ‘rat-hole’, it would be helpful if it – or some similar Australian site – gave unveiled critical comment. I use Digital Commons and am pretty happy with it, but I am also always on the lookout for something that is working better. I really do not know how all of the other systems are performing, both in regards to the basic issues of archiving and exposing research output, but also in dealing with the new kids on the block, namely ERA, images, connectivity with HERDC etc etc. I don’t think we need to pussy foot around in this regard, and it would be helpful if we all talked openly on such matters. So thanks for your musings Peta – they are stimulating as ever, and that’s the point isn’t it?
@amberyn, @bernadette & @alison
I am wondering if two things might happen:
1. Unis keep the local management part of their repository but let the outward facing OA stuff go and direct people to search in something like the NLA’s system or Google Scholar, as harvested from the discovery service.
2. Consortia might emerge where Unis share an IR, harmonize their metadata and their processes – this might be attractive to very small institutions.
Couple of comments:
1. The trouble with a central repository, tempting as it is, is the issue of ongoing funding. Who pays for it? Where does it “live”? Who is responsible for it? In my experience the government have been willing to seed repositories, but don’t want permanent responsibility.
2. On the other hand, I think the government would be quite happy if more institutions got together and ran shared services, rather than everybody having their own repository. So there are some legs in Peter’s latter suggestion.