[update 2010-08-31 Clarified something]
Copyright © 2009 Published by Peter Sefton

A while ago, I got an email from the nice people at Elsevier (actually I think it might have been from a machine, but never mind) saying that the article I wrote for Serials Review is now available here using this DOI: doi:10.1016/j.serrev.2009.05.001. At the moment that link seems to resolve to an open version of the article, whether or not you have a subscription to the journal, but I guess that will change; when it is ‘published’ you will only see the article if you are clicking from inside a network that’s on their list of subscribers. If not, you will need money to see it.

But I can post the article here with the copyright statement you see below and remind you that you need to use the DOI to cite the paper should you wish to. No naughty linking back here (unless it is to reference these comments I’m adding). And no linking to the version I’m about to put in ePrints. OK? Even though you know that if you do link to the DOI some people may not be able to see the article in the future, don’t do it, use the DOI link. There, I think I told you.

The version I am posting here that you MUST NOT LINK TO is my copy, updated to match the changes made in the approval process, but Elsevier have yet to add their very valuable formatting (read on to find out more about what I think about that).

Also, I have to warn you that what I don’t understand is the license that Elsevier is extending to you, the reader. I asked them a while back (on June 2nd 2009). I got a reply that didn’t answer my questions, which I present to you here. Here is the email of June 2nd:
Dear [deleted],

Thanks for getting back to me. I have read the page you linked below [PS: it was this link http://www.elsevier.com/wps/find/authorsview.authors/authorsrights] and the pamphlet it links to and I still have two main questions.

1. What should the copyright statement on my final draft read when I deposit it in my local repository: (c) Peter Sefton or (c) Elsevier?

2. What LICENSE should be attached to the file? That is, how do I let readers know “those who download or access the PDF of your article must abide by your copyright agreement with regard to this electronic copy.” Is there a single page statement of my copyright agreement in its entirety somewhere? What rights do readers have?

For example, the version (#2) of the pamphlet linked from the author rights page is different from the latest version at Library Connect (#4) and they have different rules about how much text users may quote from an article. Neither has anything to say on how readers may store and distribute my version of the article. Is that up to me? Can I attach a creative commons license to it?

Peter

I followed this up this morning, but until I hear back all I can say is that while I can post this here with the copyright notice below, I’m not sure you have permission to read it or keep it in your web cache or print it. Caveat Lector, I guess.

This is as-approved except that I have fixed the most glaring problem with the references, a missing URL for a self-citation from the eResearch Australasia Conference last year. Remember, to cite this paper, use this: doi:10.1016/j.serrev.2009.05.001

From here down is not mine:

---

Copyright © 2009 Published by Elsevier Inc.
Towards scholarly HTML

Sefton is Manager, Software Research and Development Team, Australian Digital Futures Institute, University of Southern Queensland, Australia.

Available online 30 June 2009.

[...] words could be rearranged and edited without having to be re-typed from beginning to end or altered using physical means, such as cut and paste of typed text or obscuring typescript with correction fluid and then typing or writing over it. Then from the mid-1980s, desktop publishing began to democratize access to typesetting tools. Because of this desktop publishing revolution, word processors began to ship with a “What You See Is What You Get” (WYSIWYG) view, which encouraged text production and formatting to become one and the same operation. Coombs et al. give a useful summary of the state of the art in 1987 [1]. Before the Web, authors had become used to editing within the constraints of the A4 or US Letter page.

It has been argued that this WYSIWYG desktop publishing revolution has had a counter-productive effect on the progress of publishing, and scholarly publishing in particular, by promulgating what Sorgaard and Sandahl call the ‘paper metaphor’ [2].

The World Wide Web is now the key distribution channel for scholarship, offering new potential for documents which are seamlessly integrated with machine-readable data and with human-readable visualization services for data, as discussed by Murray-Rust and Rzepa in their paper introducing the term datument to represent this new kind of eScholarship [3]. Since the Web arrived in the mid 1990s, writing tools, such as Microsoft Word, have failed to adapt to a Web-based publishing environment [4]. The HTML that word processors typically export for the Web is far from standards compliant and unsuitable for use on journal Web sites or in institutional repositories. I will show below how this combines with the dominance of the paper metaphor to reinforce the role of the PDF as the currency for research reporting.
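To make the point about word-processor HTML export a little more concrete, here is a minimal sketch in Python that counts a few kinds of markup commonly seen in such exports: namespaced tags such as `<o:p>`, `Mso...` class names, and `mso-` style properties. The names `WordMarkupCounter`, `count_word_markup` and `SAMPLE` are hypothetical, and the sample markup is hand-written for illustration rather than the output of any particular tool. It is a rough heuristic, not a standards-compliance check.

```python
from html.parser import HTMLParser


class WordMarkupCounter(HTMLParser):
    """Count markup that is typical of word-processor HTML export."""

    def __init__(self):
        super().__init__()
        self.namespaced_tags = 0  # e.g. <o:p>, which is not an HTML element
        self.mso_classes = 0      # e.g. class=MsoNormal
        self.mso_styles = 0       # e.g. style="mso-bidi-font-size:10.0pt"

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if ":" in tag:
            self.namespaced_tags += 1
        for name, value in attrs:
            value = (value or "").lower()
            if name == "class" and value.startswith("mso"):
                self.mso_classes += 1
            if name == "style" and "mso-" in value:
                self.mso_styles += 1


def count_word_markup(html_text):
    """Return counts of word-processor-specific markup found in html_text."""
    parser = WordMarkupCounter()
    parser.feed(html_text)
    parser.close()
    return {
        "namespaced_tags": parser.namespaced_tags,
        "mso_classes": parser.mso_classes,
        "mso_styles": parser.mso_styles,
    }


# Hand-written sample in the style of a word-processor export (an assumption).
SAMPLE = (
    '<p class=MsoNormal style="mso-bidi-font-size:10.0pt">'
    "Some text<o:p></o:p></p>"
    "<p>Plain paragraph</p>"
)

if __name__ == "__main__":
    print(count_word_markup(SAMPLE))
```

A repository or journal site could run a check along these lines before accepting HTML from authors, though a real workflow would more likely pass the file through a proper validator.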
Word processors are merely the most common writing tool for the academy, not necessarily the best. Many proposals have been put forward for structured authoring using XML (and before that SGML), which has worked well in industries such as legal publishing and the military, where specialist editorial teams can be trained and supported. Norman Walsh [5] captures the essence of the advantages of structured authoring in a contribution to a debate in the Journal of Digital Information. While the principle is sound in a theoretical sense, experience at the University of Southern Queensland (USQ) with an end-to-end XML publishing system for course materials has not been encouraging: the completed system had close to zero uptake from academic staff [6]. It is true that many commercial publishers use XML in their production systems, but it is unusual for authors to contribute XML; in most cases authors submit word-processing files which are then converted to XML behind the scenes. I reviewed the state of the art and the (very limited) literature on how word processing might be integrated into back-end publishing systems in a 2006 paper for the Australian World Wide Web conference [7], and in 2008 for the Australasian eResearch conference [8]. Since then Microsoft Research has released previews of a Microsoft Word-based tool which is claimed to produce XML conforming to the National Library of Medicine schema, but there is so far no evidence of the tool being used by typical authors. To illustrate the potential divide between the author’s version and the publisher’s, consider that Elsevier, the publisher of this journal, recently ran a competition, Article 2.0 [9], to show the future of a scientific article.
The competition winner shows that a journal article may be the Web locus for discussion, annotation and semantic relationships, but this competition was built on XML source documents which are created and held by the publisher, so there is no way that a typical institutional repository could easily provide the same services. This is a case where the publisher is shaping scholarly communications, or at least exploring how to do so, but a lack of tools means that repositories are unlikely to be able to do likewise. This creates a distinct divide between the publisher’s more richly marked-up version and the version held by the author in word processing format or the typesetting system LaTeX [10], neither of which allows high-quality HTML to be generated unless the author has used a particular set of templates and/or macros and has access to specific conversion software. So there is no way for most author manuscripts, which are commonly deposited in institutional repositories, to be turned into usable Web content, let alone with links to data and semantic-Web content. The best most authors could hope for with their version would be to convert it to PDF and deposit it in a repository, while the publisher can do much more with the article. Against this background, our work in shaping scholarly communications in the Australian Digital Futures Institute (ADFI) at USQ has been focused on three areas:
- Empowering authors with content creation using tools that are not constrained by the paper metaphor so that, no matter what happens on the publisher’s side of the transaction, authors can use and re-use their work as flexibly as possible.
- Work on integrating with institutional repositories, particularly in packaging, using the Open Archives Initiative protocol for Object Reuse and Exchange (OAI-ORE) [11].
- Most importantly, providing the above tools for use by post-graduate students writing theses; feeding new users into the academy who know how to create ‘Article 2.0’ type content for themselves.
- Automated document conversion with a rapid feedback cycle so authors can see their content in a Web context as they write.
- Inline-threaded annotation in a Web view. The screenshot in Fig. 1 shows a colleague commenting on a draft version of this paper.
  Figure 1. A screenshot of the ICE inline annotation system in a Web browser, showing threaded comments.
- Protocols for linking to data that support a document and make research more reproducible, as well as live visualization tools. This has been demonstrated in the TheOREM project [note we will reference a forthcoming paper at Open Repositories 2009 here—details TBC].
- As a complete distributed content management system as used for course creation at USQ.
- As a central content management system where groups can collaborate on content.
- As a set of software components that can be embedded in other applications. This has been done at USQ to integrate ICE conversion services into the Moodle learning management system, and some experiments have been undertaken with the Open Journal Systems software.
- Institutional repositories are expected to contain only PDF.
- Publishers ask for Word or LaTeX documents which they then process into PDF and, maybe, HTML without any supporting data or complex visualization technologies.
- To take the proof of concept work from ICE-TheOREM and to do for theses what we did for courseware at USQ and become the first institution in Australia with a mandate for all theses to be made available not just on the Web in PDF but of the Web, in HTML. The most obvious way for a candidate to comply would be to use ICE but other tool chains could be used.
- To make the courseware process more like a research workflow by introducing post publication peer review for course content, thus turning an established workflow into a publishing model without attempting to change an existing system over which we have little influence.
- An author creates courseware which is published as open courseware by the university.
- The author writes a short paper abstracting a module or book from the courseware with an explanation of what the item represents: it might be a literature review or contain instructional design which is the product of research into previous cohorts of students.
- The author submits the paper to an existing journal or to a new kind of open access journal, which would be article-centric and arrange for peer review on a rolling basis as articles are submitted—with the output of the journal deposited directly into a repository of papers on pedagogical practice.
- Reviewers would be able to recommend not only changes to a paper to make it publishable, but to the courseware item itself.