In this post I question the use of the word ‘own’ in relation to research data. Is it misleading to talk about owning data? This came up as I was doing research into policies and procedures for research data management, in the context of projects funded by the Australian National Data Service, designed to promote data re-use and sharing. Feedback wanted!
(Disclaimer: I discuss here issues that research organizations have to work out at a policy level and I am certainly not going to attempt to do that here and now. This is my private blog. I am not a lawyer. I’m working at UWS with others on policy in this area, alongside the work we’re doing on a Research Data Repository but this is not UWS speaking.)
Richard Stallman famously urges us to reject the term Intellectual Property – ‘IP’ to its friends – on the grounds that it confuses and conflates several legal frameworks under one term: Did You Say “Intellectual Property”? It’s a Seductive Mirage. People have an intuitive sense of what property is like, who has rights to do various things with it. Problem is, when it comes to rights in intangibles the intuitive view is usually wrong, and talking about ‘my IP’ gives a false sense of propriety, a sense that somehow the products of one’s intellect can be locked up like a house, or fitted with an immobiliser like a car. I tend to agree with Stallman that IP is not usually a useful generalisation but I’ll use it here, because the point of this post is to discuss property rights in data.
Now, to the point: the concept of ownership of data. I’m sure I’ve talked about owning data, and I heard it a lot last week at an eResearch meeting about research data management in Sydney. Every time I hear it now I wonder, what does it really mean to own data? It is certainly not about who owns the disk drives or the USB message sticks or the paper on which the data are stored. When you talk about ownership of something that falls under the banner ‘IP’, you need to specify which Intellectual Property right you own.
Think about this: if you wrote a book, would you talk in a casual setting about ‘owning’ it? No, at a party you’d say “I’ve written a novel”. Or with a research paper, you’d likely talk about having had it published, rather than owning it (not least because it’s likely you don’t own the copyright). There’s an inherent recognition of how copyright works in the way people refer to these things: ‘owning’ is usually ‘owning a copy’, not ‘owning the rights to’. But the way we talk about data doesn’t seem to be so nuanced; you do hear people talking about owning data rather than having collected it, or compiled it, or being responsible for its preservation and upkeep.
I am not a lawyer, but as far as I can tell in Australia, there are two types of Intellectual Property in data that you might own.
You can talk about ownership of copyright in a data set that has had sufficient creativity put into its compilation to make it a creative work (and no, nobody can tell you for sure what qualifies). Many advocates of open access research reject the use of copyright as a way of controlling research data (for example this group), but note that in Australia official advice is to use copyright licenses (see below).
Or there’s confidential information, or trade secrets. That is data that you take reasonable measures to protect, restricting access by others through contracts that limit their rights. I have not seen much discussion of this aspect of IP law in the eResearch area but it would seem reasonable to this non-expert non-lawyer that if you keep data private, lock your office and don’t publish it on the internet then you would be able to complain if someone took a copy (or the only copy, even) of some data. If you put it on the Internet without a license or waiver, less so.
Should we be talking about ‘owning’ data?
In conclusion, what Stallman says about the term Intellectual Property applies just as much to the word ‘own’:
The term [IP] carries a bias that is not hard to see: it suggests thinking about copyright, patents and trademarks by analogy with property rights for physical objects.
(Please comment below if you have a good definition of data ownership and/or you disagree.)
Postscript, what to tell researchers?
Curmudgeonly rants about terminology aside, what should researchers do?
In the context of the above disclaimer, my thoughts:
Don’t talk about ‘ownership’ of data without qualification as to which kind of ownership you mean – others may make completely wrong assumptions about what you mean. Remember that data collections are not subject to the same laws as physical property, and that the thresholds for copyright in data differ from those for creative works like research papers.
Do consider how you would like to share, manage, preserve and re-use data to further the cause of research, to make sure findings can be validated, and to be cited as a data creator or compiler. (There are lots of reasons to share. Let’s assume those conversations are taking place as they should be.)
Never share data with anyone without an explicit statement of what your expectations are for how it can or cannot be used, re-used, cited and disseminated.
For openly available data in Australia the recommendation from ANDS is to use Creative Commons licenses, which are copyright-based licences (even though there is a degree of uncertainty around the extent of copyright in data). The CC licenses give you a way to express the terms under which you would like to share data.
For confidential or commercially sensitive data that can’t be shared openly, talk to your office of commercialisation about appropriate contractual arrangements for data sharing.
Copyright Peter Sefton. Licensed under Creative Commons Attribution-Share Alike 2.5 Australia. <http://creativecommons.org/licenses/by-sa/2.5/au/>
Types of IP protection
IP rights give you the exclusive legal right to take advantage of your IP and help you prevent others infringing it. There are different types of IP protection in Australia, each with its own legislation.
Patents - for new or improved products or processes.
Trade marks - for letters, words, phrases, sounds, smells, shapes, logos, pictures, aspects of packaging or a combination of these, to distinguish the goods and services of one trader from those of another.
Designs - for the shape or appearance of manufactured goods.
Plant breeder’s rights - for new plant varieties.
Copyright - for original material in literary, artistic, dramatic or musical works, films, broadcasts, multimedia and computer programs.
Circuit layout rights - for the three-dimensional configuration of electronic circuits in integrated circuit products or layout designs.
Confidentiality/trade secrets - including know-how and other confidential or proprietary information.
Nothing prevents the AusGOAL suite of licences being applied to data. The law with respect to the subsistence of copyright in factual information (for example, data) is currently being tested before the Courts. The outcomes of this litigation may cause further refinement of the application of Creative Commons licences. It is also important to note that not all data would fall into the spectrum being considered in the present litigation. Until the litigation is resolved, the Creative Commons licences remain an effective tool for the licensing of data.
In any event, Creative Commons licences will continue to adequately serve the purpose of identifying attribution requirements of the creator or publisher, and provide other benefits such as a positive and prominent notice of terms and conditions of use of the information to which they are applied. The AusGOAL Framework will be updated and/or modified to address the outcomes of the current legal proceedings when they come to an end.