I have been writing an article for the Special Issue of the New Review on Information Networking that Les Carr is putting together. He wondered if I would like to submit something on repository architecture. I was about to blog anyway about some of the ideas from a previous post here, and I had promised on the CAIRSS blog to outline an architecture by which a service like the NicNames identity management system might sit alongside a repository software application, so I rolled those things into a (rushed) article, which will appear somewhere in due course – if it’s rejected I can always blog it.
One of the things I looked at in the paper was this notion that a repository should be viewed as a set of services rather than a monolithic application. I have quoted Clifford Lynch here before, and I quoted him in the paper:
In my view, a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution. While operational responsibility for these services may reasonably be situated in different organizational units at different universities, an effective institutional repository of necessity represents a collaboration among librarians, information technologists, archives and records mangers, faculty, and university administrators and policymakers. At any given point in time, an institutional repository will be supported by a set of information technologies, but a key part of the services that comprise an institutional repository is the management of technological changes, and the migration of digital content from one set of technologies to the next as part of the organizational commitment to providing repository services. An institutional repository is not simply a fixed set of software and hardware.
The cognoscenti do go on like this. I know I’m guilty. I have been heard to argue that the repository should be thought of as an institution (or a lifestyle, or a way of life, or a governance framework) rather than a software application, but for most working repository managers I suspect that the repository is really much more closely tied to and bounded by the software application used than to some fancy ‘set of services’ or governance framework. Try this. It seems natural enough to say “What repository do you run?” whereas you would never ask a librarian “What library do you run?”
Anyway, as I was thinking about an outline for the paper I did a quick unscientific (and slightly inept) poll on Twitter.
Hi repository people (& rat.) Can you define “repository” or “institutional repository” here in Twitter, in yr own words please?
3:48 PM Jul 21st from web
It was inept because I should have used a hashtag for people to label their replies instead of them coming back to @ptsefton. I tried again a bit later.
#repodef Can you define “repository” or “institutional repository” here in Twitter, in yr own words please? Will blog results.
I asked because I was interested to see whether people’s definitions were of repository-as-monolithic-application, or tended towards service-orientation or repository-as-institution.
Presented below are all the responses directed at me. There may have been others that were just tagged, but as I write this Twitter is returning zero results for the tag #repodef.
I will comment on a few of the responses. An early participant was Tim McCallum from my team. He’s the CAIRSS (Repository Support Service) Technical Officer and he had a very concrete view of the repository as ‘a system’ with an emphasis on research output, as I’d expect from someone up to his neck in the day-to-day issues with repositories and the System to Evaluate the Excellence Of Research (SEER) here in Australia.
mistermac2008 @ptsefton Online system that receives, organizes and normalizes institutional research output for searching/harvesting by machine and humans
4:43 PM Jul 21st from web
Ploy Tangtulyangkul, who works for us from wherever in the world she happens to be and has stoked the boiler of a repository in Western Australia, had a rather succinct view, to do with research output, attributed to an ex-USQ person, Neil Godfrey.
waoewaoe @ptsefton – sorry research showcase – shall i quote @neilgodfrey for giving this concept to me?
12:23 AM Jul 23rd from web
A couple of people mentioned the repository as ‘a place’ and there was mention of ‘system’. Dorothea Salo, tweeting in her repo-rat super-hero costume, had a definition I thought captured the Lynch view (she also had a silly definition):
TheRepoRat @ptsefton Legit defn: a service focused on capturing, disseminating, and preserving many sorts of digital institutional output.
3:14 AM Jul 22nd from TweetDeck in reply to ptsefton
I think that, along with the rat, Andy Powell had the definition which was closest to repository-as-institution. In the broadest sense a system might include people:
andypowe11 @ptsefton “a system for managing and sharing academic content in order to support scholarly communication and/or teaching and learning” ?
10:23 PM Jul 22nd from twhirl in reply to ptsefton
This is splitting hairs, maybe, but Paul Richardson’s similar answer implied that the store was the repository and the fact that it was managed was a property it had, in contrast to Andy Powell’s bundling of the managing into the system.
paulbrichardson @ptsefton #repodef A managed store of digital assets. Metadata & security addressed. Copyright explicit. Access open or by subscription.
10:27 PM Jul 22nd from web in reply to ptsefton
paulbrichardson @ptsefton #repodef I didn’t define, of course. Just put in some ideas which may help….
10:29 PM Jul 22nd from TweetDeck in reply to ptsefton
Nigel Ward said that a repository was software, which from anyone else would kind of confirm my view that people think of repositories as a bit of software. But you have to remember that Nigel is an e-Framework participant, and they see everything as a set of services (that’s Service Oriented Architecture, only without the capitals: soa), so I will count his response as being one for repository-as-computational-services.
nigeynige @ptsefton A repository is software that provides access to and manages collections.
4:01 PM Jul 21st from Tweetie in reply to ptsefton
nigeynige @ptsefton A collection is any aggregation of content items, physical or digital [from ISO 2146]
4:02 PM Jul 21st from Tweetie in reply to ptsefton
I’ll leave you with all the replies that Twitter is showing now, starting with the most recent. Comments welcome. No, no statistics or further analysis except some ‘don’t rely on Twitter to remember a hash-tag from two weeks ago’.
onothimagen @ptsefton ‘place to put stuff’ or ‘digital object storage and management system’ or ‘all things to all men’
1:34 AM Jul 23rd from Nambu
sshreeves @ptsefton inst repo same definition except gen focused on research/scholarship/intellectual output and also meant to protect inst investment
1:25 AM Jul 23rd from TwitterFon in reply to ptsefton
sshreeves @ptsefton system to disseminate, manage, and aid in preservation of defined set of materials (depending on collection policy)
1:23 AM Jul 23rd from TwitterFon in reply to ptsefton
waoewaoe @ptsefton – sorry research showcase – shall i quote @neilgodfrey for giving this concept to me?
12:23 AM Jul 23rd from web
cgutteridge @ptsefton #repodef repository = “place to put stuff to make it easy(er) to find and hard(er) to lose”
11:16 PM Jul 22nd from twidroid in reply to ptsefton
cardcc @ptsefton JISC definition makes journal (or journal site) a special kind of repository. Also any library, digital or not! #repodef
11:00 PM Jul 22nd from TweetDeck in reply to ptsefton
cardcc @ptsefton ‘managed’ should mean some kind of catalogue/metadata (so a bit > filestore) & access control; not necessarily OAI #repodef
10:58 PM Jul 22nd from TweetDeck in reply to ptsefton
cardcc @ptsefton I’ll suggest an IR is a repository run by/for an institution! May be more than one… #repodef
10:56 PM Jul 22nd from TweetDeck in reply to ptsefton
cardcc @ptsefton #repodef The JISC definition from comments to http://bit.ly/g5ol0
10:55 PM Jul 22nd from TweetDeck in reply to ptsefton
cardcc @ptsefton JISC: repository a managed store of content that enables sharing of that content. Key ‘managed’, ‘sharing’ & ‘content’. #repodef
10:51 PM Jul 22nd from TweetDeck in reply to ptsefton
N3B @ptsefton A depot of all digital output from an institution for preservation and dissemination.
10:48 PM Jul 22nd from web in reply to ptsefton
paulbrichardson @ptsefton #repodef I didn’t define, of course. Just put in some ideas which may help….
10:29 PM Jul 22nd from TweetDeck in reply to ptsefton
paulbrichardson @ptsefton #repodef A managed store of digital assets. Metadata & security addressed. Copyright explicit. Access open or by subscription.
10:27 PM Jul 22nd from web in reply to ptsefton
andypowe11 @ptsefton “a system for managing and sharing academic content in order to support scholarly communication and/or teaching and learning” ?
10:23 PM Jul 22nd from twhirl in reply to ptsefton
TheRepoRat @ptsefton: DevDict defn: a boondoggle started in hope and supported in fear of looking bad. (True mostly of US. Eur/Aus have better ones.)
3:15 AM Jul 22nd from TweetDeck
TheRepoRat @ptsefton Legit defn: a service focused on capturing, disseminating, and preserving many sorts of digital institutional output.
3:14 AM Jul 22nd from TweetDeck in reply to ptsefton
cardcc @ptsefton hope you will blog the definitions of repository and IR, as you didn’t give a hashtag. The R word a tough one; includes filestore
9:45 PM Jul 21st from TweetDeck in reply to ptsefton
TheRepoRat @ptsefton … gah. Do you want a straight-up definition or an Ambrose Bierce Devil’s Dictionary one?
5:10 PM Jul 21st from TweetDeck in reply to ptsefton
mistermac2008 @ptsefton Online system that receives, organizes and normalizes institutional research output for searching/harvesting by machine and humans
4:43 PM Jul 21st from web
nigeynige @ptsefton A collection is any aggregation of content items, physical or digital [from ISO 2146]
4:02 PM Jul 21st from Tweetie in reply to ptsefton
nigeynige @ptsefton A repository is software that provides access to and manages collections.
4:01 PM Jul 21st from Tweetie in reply to ptsefton
Pete,
I think you probably credit me with too much subtlety in my use of ‘system’ rather than ‘store’ or ‘service’ or ‘software application’ or whatever. The truth is that I probably chose to use ‘system’ as in ‘content management system’ (not least because I remain firmly of the view that an appropriately managed website is a perfectly good choice as a “system for managing and sharing academic content in order to support scholarly communication and/or teaching and learning”). I wish I’d meant it more broadly. The ‘system’ certainly has to be understood more broadly than just the software (people and workflows and social interactions and …) though I’m not sure I fully understand your use of ‘repository-as-institution’.
I agree with you that, despite Cliff Lynch’s long-standing ‘set of services’ definition, in practice, for most of us, the repository is very closely associated with a particular ‘bit of software’. I’m not saying that is right or wrong, just that that is how things have tended to pan out in the wild.
To the people that used the word ‘store’… my gut feeling is that it is far too narrow – actually, I think that one of the problems with repositories is that they have been seen as ‘stores’ rather than as (social) ‘systems’.
Other than that, I think this was an interesting experiment – thanks for writing it up. Perhaps this “use of Twitter to define X” is something we should try more often? That said… the trouble with 140 characters is it just leaves us with something that then has to be explained in much greater detail!
Hi Pete
‘The question is,’ said Alice, ‘whether you can make words mean so many different things.’
It’s interesting how hard it is for your correspondents to keep the adjective “institutional” away from the noun “repository”, either explicitly or implicitly in their definition! Logically, shouldn’t we define “repository”, then the subset that is “institutional” (and, further, “academic”, “scholarly”, “trusted”, “subject”, “dataset(ish)”, etc)? If all repositories were institutional, then retaining the adjective would be tautologous.
Mustn’t a repository, by definition, first and foremost, be somewhere to put and keep stuff – where it reposes or is reposed? By reasonable inference, also, somewhere to find it again, e.g. via catalogues and metadata, but these and their further clever uses are still side-effects. (You could create a repository just to catalogue, not to store, but that seems wasteful.)
A repository can’t be software, any more than a “database” is software. A database is what you might create /with/ DBMS software; in a similar vein, we probably need clearer terminology to distinguish between the instance (repository) and what creates it (RMS, IRMS?).
I came at the IR thing a few years ago, after working previously on digital archives systems that were less explicitly academic. The two endeavours have lots in common (storage, management, discovery, retrieval) – and there is more to the “systems” than just hardware and software – but then there was that word “repository”, and everyone involved had a different conception (or mis~), inevitably coloured by their own (institutional) position and experience. And I started to understand just how Alice felt.
‘There’s glory for you!’
Hope you can get to the bottom of it – I will keep watching this space
I work on a team called the Repository Development Center at the Library of Congress and I know we have struggled with what the word ‘Repository’ means in our environment. Over the past few years I’ve gradually begun to appreciate that the problem with thinking about repositories in the library field is that repositories are essentially libraries. So the scope of repository work naturally expands to the scope of the library itself … which makes it hard to manage. As more and more library professionals become familiar with digital content I think there will be less and less of a need for the term. Whether this will happen or not, I’m not quite sure…