Running an Open Source project from a university dev team

Steven Hayes from Arts eResearch at the University of Sydney invited me to visit their group and talk about running open source software projects, as they are making their Heurist (semantic database-of-everything) software open source. This was more of a conversation than a presentation, but I prepared a few ‘slides’ to remind me of which points to hit. Here are my notes. The focus here was not on why to go open source, or on open source in general; it was about doing it in a small university-based team. Comments about how various uni open source projects run would be appreciated.

I have been involved in creating two sizeable code-bases both released by the University of Southern Queensland as open source. They had very different histories. I’ll talk about both and how they run, although actually one of them doesn’t run any more in any meaningful way.

Two projects I started…

… on which other people* did most of the work

  • ICE – the Integrated Content Environment. Used at USQ for creating course materials for delivery online and in print. Almost no activity on this outside of USQ these days. Inside USQ? I don’t know for certain, but I think it is still in use, and finding a replacement has proven difficult (which doesn’t surprise me, as that was the reason we built it in the first place).

  • ReDBOX – the Research Data Box (and The Fascinator, the underlying toolkit).

*Thanks to Ron Ward, Oliver Lucido, Linda Octalina, Duncan Dickinson, Greg Pendlebury, Daniel de Byl, Bron Chandler, Tim McCallum, Cynthia Wong, Jason Zejfert, Sally MacFarlane, Caroline Drury, Pamela Glossop, Warwick Milne, Sue Craig, Vicki Picasso, Dave Huthnance, Shirley Reushle and the late Alan Smith who made, tested, championed and supported these projects. Thanks also to funding from the Australian government via ANDS, ARROW and other streams. Sorry if I forgot anyone.

(At this point I wanted to check that everyone knows what Open Source means, making sure that we all understand how Richard Stallman made software free using copyright law. Whoever holds the copyright in a bit of software, which is likely to be whoever wrote it or their employer, can control distribution by using a licence, a legal instrument. Stallman’s insight was that a licence could be used to enforce sharing, openness and freedom: you can use this stuff I created provided you promise to share it with other people (that’s not a quote). Oh, and people working in this space should also understand the difference between Free and Open Source [1].

But I forgot.)

RTFM

Above, I linked to a free book on producing Open Source software [1] by Karl Fogel, which seems to cover most of what you’d need to know. I haven’t read it all, but it looks useful.

But I don’t like this

The book begins:

Most free software projects fail.

I think that’s silly, talking about failure without first defining success.

Me, I’m not sure that all these scenarios Fogel lists are failures at all; there are lots of reasons to release code, and they are not all necessarily about building a substantial community:

We tend not to hear very much about the failures. Only successful projects attract attention, and there are so many free software projects in total[2] that even though only a small percentage succeed, the result is still a lot of visible projects. We also don’t hear about the failures because failure is not an event. There is no single moment when a project ceases to be viable; people just sort of drift away and stop working on it. There may be a moment when a final change is made to the project, but those who made it usually didn’t know at the time that it was the last one. There is not even a clear definition of when a project is expired. Is it when it hasn’t been actively worked on for six months? When its user base stops growing, without having exceeded the developer base? What if the developers of one project abandon it because they realized they were duplicating the work of another—and what if they join that other project, then expand it to include much of their earlier effort? Did the first project end, or just change homes?

What’s the first thing that comes to mind when you think of Open Source?

Linux? Apache? WordPress? Firefox?

The hits. The stadium-filling rock-star projects?

Your band has a 99.9% probability of staying in the garage

Figure 1 Me (the good looking one) and cousin Tim at the Springwood Sports club, about to perform with a community uke-group. No plans for world-domination, playing for family, who are obliged to attend, and even some people who, for some reason, choose to come. #Notfailure.

It’s important to work out why you are going to release software as Open Source – think about the audience. One very important audience is you, yourself. If you work on code as part of your job, then your employment contract may well mean that your employer owns the copyright. Do you want to be able to continue using it in your next job? Show potential employers? Making it open source helps your future self.

I know this first hand.

Universities are not as stable as they seem, or you may hope. At the Australian Digital Futures Institute at USQ we began by hosting code repositories and websites internally. I reasoned that the university would be a good bet for maintaining persistence of these resources.

But then one Gilly Salmon came to our institute to be the new professor and decided, along with the rest of the senior leadership team, that there was altogether too much making the digital future going on in the Australian Digital Futures Institute, too much technology. They let just about all the technical staff go, no matter how useful they were to the organisation, or how pregnant they happened to be (we’re a relationship brand, the director of marketing told me, so we shouldn’t be continuing to develop software to deliver award-winning distance-ed services).

Web sites that would still have value are just gone from public view, including, ironically, the PILIN project site, which was about persistent identifiers. Even the ICE website, which is full of useful stuff for USQ itself, now appears to be accessible only via the Wayback Machine. They’re still using it but they turned off the website anyway. The code, however, is sitting on Google Code, so we all still have access to it.

This sort of thing happens all the time. For a couple of us, the NextEd refugees, this was the second redundancy associated with USQ. Kids, it is prudent to make sure that any code you might want to re-use later in your career is released under an open licence, and documentation, web sites etc. likewise under Creative Commons. Think of it as a professional escape pod.

The ReDBOX project survived this ADFI shutdown because it had been open source from the beginning, but further funding had to be redirected to another university which was willing to host the building of a digital future.

Lessons

  • Open Source can be worth doing even if the audience is your future self

  • Don’t trust someone else to keep your website up

  • If you want a community you’ll (likely) have to build it

  • Every project is different, so you need to structure yours around your users

Oh, and the answer to most questions is on Stack Exchange. I decided that this list was worth using as a starting point for discussion.

http://programmers.stackexchange.com/questions/51553/checklist-for-starting-an-open-source-project

Havoc P said: [with additions by me post the discussion at USYD]

Things I’d put in the early priorities are:

  • have a simple “what is it?” web site with links to some discussion forum (whether email or chat) and to the source code repository

    [Mailing lists are usually best IMO – forums can be empty, echoing and make your project look unloved. A tech list is a must, always, but other communications should be built around the reality of your project. No user community yet? Build one. Others over at Stack Exchange added that once you have a tech-list it is best to hold or log all your discussions there so architectural decisions are transparent and the community can engage.

    On the ReDBOX project there are two main mailing lists, one for the techies and one for the users (mostly library staff), and lots of virtual and face-to-face get-togethers. There is a committers group who are in charge of what gets into the trunk and various ad-hoc arrangements to sponsor sub-projects at the dozen or so sites using the software. The groups and how they interact were all created to serve that community, not from some manual of best practice, although it is all informed by collective experience of open source projects.]

  • be sure the code compiles and usually works, don’t commit work-in-progress or half-ass patches on the main branch that break things, because then other people’s work would be disrupted

    [Well, OK, but if you’re releasing an existing code base then don’t get too hung up on making things perfect: (a) it will be a huge waste if there is no demand for your code and (b) don’t be unnecessarily shy, most open source projects are like busking, not stadium rock, nobody is watching you waiting to pounce on your errors.]

  • put a license file in the code repository with a well-known license, and mark the copyright owner (probably you, or your company). don’t omit the license, make up a license, or use an obscure license.

  • have instructions for how to contribute, say in a HACKING file or include in your README. This should include where to send patches, how to format patches, code indentation rules, any other important conventions of the project

  • have instructions on how to report a bug

  • be helpful on the mailing list or whatever your forums are
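A few of the items above (a license file, a README, contribution instructions) can be checked mechanically before you announce a repository. What follows is only a minimal sketch, assuming these files sit at the top level of the repository under one of a handful of common names. The `CHECKLIST` mapping and the `missing_items` function are hypothetical names made up for this illustration, not part of any tool mentioned here.

```python
from pathlib import Path

# Hypothetical mapping from checklist items to file names that commonly
# satisfy them. Adjust to your own project's conventions.
CHECKLIST = {
    "license": ("LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING"),
    "readme": ("README", "README.md", "README.txt", "README.rst"),
    "contributing": ("HACKING", "HACKING.md", "CONTRIBUTING", "CONTRIBUTING.md"),
}


def missing_items(repo_root):
    """Return the checklist items with no matching file at the top of repo_root."""
    root = Path(repo_root)
    # Compare names case-insensitively, since projects vary in capitalisation.
    present = {p.name.lower() for p in root.iterdir() if p.is_file()}
    return [
        item
        for item, candidates in CHECKLIST.items()
        if not any(name.lower() in present for name in candidates)
    ]


if __name__ == "__main__":
    import sys

    # Check the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for item in missing_items(target):
        print(f"missing: {item}")
```

This only tells you whether the files exist; whether the README really says where to send patches or how to report a bug still needs a human to read it.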

More from Havoc P

After those priorities I’d say:

  • documentation (this saves you work on the mailing list… make a FAQ from your list posts is a simple start)

  • try to do things in a “normal” way (don’t invent your own build system or use some weird one, don’t use 1-space indentation, don’t be annoyingly quirky in general because it adds learning curve)

  • promote your project. marketing marketing marketing. You need some blogs and news sites and stuff like that to cover you, and then when people show up interested, you need to talk to them and be sure they get it working and look at their patches. Maybe mention your project in the forums for related projects.

    [Yes, this is a huge one. One of the big differences between ReDBOX, which is no hit but has a solid user base, and ICE, which never made it out of USQ, is that Vicki Picasso from Newcastle Uni and I marketed the hell out of ReDBOX early to a very specific community of user-organisations. We needed a community so the software would have a sustainable base, so we designed the software for the community and sought input on the design as broadly as we could.

    With ICE, I talked about it to lots of the wrong people and didn’t sell it to the right ones, other distance ed unis, but that was partly because it conferred a competitive advantage on USQ. This comes back to the point above about success vs failure – there’s more than one way to succeed.]

  • always review and accept patches as quickly as humanly possible. Immediately is perfect. More than a couple days and you are losing lots of people.

  • always reply to email about the project as quickly as humanly possible.

  • create a welcoming/positive/fun atmosphere. don’t be a jerk. say please and thank you and hand out praise. chase off any jackasses that turn up and start to poison the community. try to meet people in person when you can and form bonds.

[1] K. Fogel, Producing open source software: How to run a successful free software project. O’Reilly Media, Inc., 2005.

Creative Commons License
Running an Open Source project from a university dev team by Peter (pt) Sefton is licensed under a Creative Commons Attribution 3.0 Unported License.