A quick open letter to eResearch@UTS

2015-02-16

2015-02-17, from a tent at the University of Melbourne

Hi team,

Thanks for a great first week last week, and thanks for the lunch, Peter Gale. I think I counted 12 of us around the table. I thought the week went well, and I actually got to help out with a couple of things, but you'll all be carrying most of the load for a little while yet while I figure out where the toilets are, read through those delightful directives, policies and procedures that are listed in the induction pack, and try to catch up with all the work that's already going on and the systems you have in place. All of you, be sure to let me know if there's something I should be doing to start pulling my weight.

As you know, I have immediately nicked off to Melbourne for a few days. Thought I might explain what that's about.

I am at the Research Bazaar conference, #Resbaz.

What's a resbaz?

The site says:

The Research Bazaar Conference (#ResBaz) aims to kick-start a training programme in Australia assuring the next generation of researchers are equipped with the digital skills and tools to make their research better.

This event builds on the successful Doctoral Training programmes by research councils in the UK [1] and funding agencies in the USA [2]. We are also looking to borrow the chillaxed vibe of events like the O’Reilly Science ‘Foo Events’ [3].

So what exactly is ResBaz?

ResBaz is two kinds of events in one:

ResBaz is an academic training conference (i.e. think of this event as a giant Genius Bar at an Apple store), where research students and early career researchers can come to acquire the digital skills (e.g. computer programming, data analysis, etc.) that underpin modern research. Some of this training will be delivered in the ‘hands-on’ workshop style of Mozilla’s global ‘Software Carpentry’ bootcamps.

You can get hands-on support like at an Apple Store’s Genius Bar!

ResBaz is a social event where researchers can come together to network, make new friends, and form collaborations. We’re even trying to provide a camping site on campus for those researchers who are coming on their own penny or just like camping (dorm rooms at one of the Colleges will be a backup)! We have some really fun activities planned around the event, from food trucks to tent BoFs and watching movies al fresco!

It's also an ongoing research-training / eResearch rollout program at Melbourne Uni.

But what are you doing there Petie?

On Monday I did three main things, apart from the usual conference networking and meeting-people stuff.

Soaked up the atmosphere, observed how the thing is run, and talked to people about how to run eResearch training programs

David Flanders wants us to run a similar event in Sydney, and I think that's a good idea. He and I talked about how to get this kind of program funded internally and what resources you need to make it happen.

Arna from Swinburne told me about a Resbaz-like model at Berkeley where they use part-time postdocs to drive eResearch uptake. This is a bit different from the Melbourne Uni approach of working with postgrads:

@ptsefton Data-driven discovery (project driven). What we'd like to do at Swinburne: http://t.co/Q6t80txxkm Also check out @NYUDataScience

— Arna Karick (@drarnakarick) February 16, 2015

Attended the NLTK training session

This involves working through a series of text-processing exercises in an online Python shell, IPython. I'm really interested in this one, not just 'cos of my extremely rusty PhD in something resembling computational linguistics, but because of the number of researchers from different disciplines who will be able to use this for text-mining, text processing and text characterisation.

Jeff, can you please let the Intersect snap-deploy team know about DIT4C, which lets you create a kind of virtualised computer lab for workshops and, I guess, for real work, via some Docker voodoo. (Jeff Christiansen is the UTS eResearch Analyst, supplied by our eResearch partner Intersect.)

Met with Shuttleworth Fellow Peter Murray-Rust and the head of Mozilla's science lab Kaitlin Thaney

We wanted to talk about Scholarly HTML. How can we get scholarship to be of the web, in rich content-minable semantic markup, rather than just barely on the web? Even just simple things like linking authors' names to their identifiers would be a huge improvement over the current identity guessing games we play with PDFs and low-quality bibliographic metadata.

Kaitlin asked PMR and me where we should start with this: where would the benefits be most apparent, and the uptake most enthusiastic? It's sad, but the obvious benefits of HTML (like, say, being able to read an article on a mobile phone) are not enough to change the scholarly publishing machine.

We've been working on this for a long time, and we know that getting mainstream publisher uptake is almost impossible, but we think it's worth visiting the Open Educational Resources movement and looking at textbooks and course materials, where the audience want interactive eBooks and rich materials (even if they're packaged as apps, HTML is still the way to build them). There's also a lot of opportunity with NGO and university reports, where impact and reach are important, and with the reproducible-research crowd who want to do things the right way.

I think there are some great opportunities for UTS in this space, as we have Australia's biggest stable of Open Access journals, a great basis on which to explore new publishing models and delivery mechanisms.

I put an idea to Kaitlin which might result in a really useful new tool. She's got the influence at Mozilla and can mobilise an army of coders. I hope there's more to report on that.

Kaitlin also knows how to do flattery:

Talking about scholarly HTML and the future of authoring with two #openscience greats: @ptsefton + @petermurrayrust pic.twitter.com/8zvaEB9SGH

— Kaitlin Thaney (@kaythaney) February 16, 2015

TODO

Need to talk to Deb Verhoeven from Deakin about the new Ozmeka project, an open collaboration to adapt the humanities-focussed Omeka repository software for working-data repositories for a variety of research disciplines. So far we have UWS and UTS contributing to the project, but we'd love other Australian and global collaborators.
Find out how to use NLTK to do named-entity recognition / semantic tagging on stuff like species and common-names for animals, specifically fish, for a project we have running at UTS.

This project takes a thematic approach to building a data collection, selecting data from UTS research relating to water to build a ‘Data Hub of Australian Research into Marine and Aquatic Ecocultures’ (Dharmae). UTS produces a range of research involving water across multiple disciplines: concerning water as a resource, habitat, environment, or cultural and migratory medium. The concept of ‘ecocultures’ will guide collection development which acknowledges the interdependence of nature and culture, and recognises that a multi-disciplinary approach is required to produce transformational research. Rather than privilege a particular discipline or knowledge system (e.g. science, history, traditional indigenous knowledge, etc), Dharmae will be an open knowledge arena for research data from all disciplines, with the aim of supporting multi-disciplinary enquiry and provoking cross-disciplinary research questions.

Dharmae will be seeded with two significant data collections, a large oral history project concerning the Murray Darling Basin, and social science research examining how NSW coastal residents value the coast. These collections will be linked to related external research data collections such as those on TERN, AODN, and, thanks to the generous participation of indigenous Australians in both studies, to the State Library of NSW indigenous data collections. Dharmae will continue to develop beyond the term of this project.
Make sure Steve from Melbourne meets people who can help him solve his RAM problem by showing him how to access the NeCTAR cloud and HPC services.
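
On the fish-name item above: I haven't worked out the NLTK route yet. As far as I know, NLTK's stock named-entity chunker (nltk.ne_chunk) is trained on general categories like PERSON and ORGANIZATION, so I wouldn't expect it to recognise species out of the box. A crude baseline to compare against is a plain dictionary (gazetteer) lookup, sketched below in standard-library Python. To be clear, this is not NLTK and not what the project does; the gazetteer entries, the function names and the sample sentence are all made up for illustration.

```python
import re

# Hypothetical gazetteer: common name -> scientific name.
# These few entries are illustrative only, not taken from the Dharmae data.
FISH_GAZETTEER = {
    "murray cod": "Maccullochella peelii",
    "golden perch": "Macquaria ambigua",
    "silver perch": "Bidyanus bidyanus",
}


def build_pattern(names):
    """Compile one case-insensitive regex that matches any of the names."""
    # Longest names first, so a longer name wins over a shorter overlapping one.
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


def tag_fish(text, gazetteer=FISH_GAZETTEER):
    """Return (start, end, matched text, scientific name) for each hit."""
    pattern = build_pattern(gazetteer)
    hits = []
    for match in pattern.finditer(text):
        found = match.group(0)
        hits.append((match.start(), match.end(), found, gazetteer[found.lower()]))
    return hits


# Made-up sample sentence in the style of an oral history transcript.
sample = "We used to catch Murray cod and golden perch near the weir."
for hit in tag_fish(sample):
    print(hit)
```

A lookup like this will miss misspellings, plurals and names that aren't in the list, which is exactly where a trained tagger might earn its keep.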
