Wave as a Scholarly HTML editor

[Update: immediately after publishing I changed the title a little bit]

I did a series of articles here a while back about trying to use various word processors and editing tools to write scholarly works for publication in HTML. Then this year, I looked in more detail at what Scholarly HTML might be like.

Now it’s Google Wave’s turn. I got my invitation (thanks Jim) and immediately had a look at how it might be used for collaboratively authoring papers. I’ll cover two things here: (a) document structure and formatting tools and (b) citations. I won’t say too much about the collaborative aspects until we have tried them out properly, but they seem to work as advertised: a few of us were able to muck around in a document together without too much trouble.


The first thing I do with any editor is see how it structures documents. Wave has a slightly weird (to me) set of choices. It has headings, which is a start, and a whole lot of direct formatting tools, which is disappointing, and it lacks both numbered lists and block-quotes, which is a real pity.

But the really interesting (to me) part is the way it handles bullet lists.

If I put in a little list like this:

Then the HTML it produces is like this:

<p style="margin-left: 14px;" class="simulated-li bullet-type-0">List 1<br></p>
<p style="margin-left: 28px;" class="simulated-li bullet-type-1">List 2<br></p>
<p style="margin-left: 14px;" class="simulated-li bullet-type-0">List 1<br></p>

Not a list element in sight! It uses plain-old paragraphs with a mixture of classes (which is fine by me) and direct formatting, which is a bit less so: “margin-left: 28px;”. This will probably get some people riled up, but I think it is a reasonable approach for an editing environment, provided one can export to proper HTML later and make something like this:

<ul>
<li><p>List 1</p>
<ul>
<li><p>List 2</p></li>
<li><p>List 2</p></li>
</ul>
</li>
<li><p>List 1</p></li>
</ul>
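
As a rough sketch of what such an export step might look like, here is some Python that reads flat paragraphs like the ones quoted above and emits nested ul/li markup. This is only an illustration, not how Wave or ICE actually does it: the function name `flat_to_nested` is made up, it assumes the nesting level can be read straight from the `bullet-type-N` class, and it assumes the input is a single run of list paragraphs with nothing else in between.

```python
import re

# Matches flat list paragraphs like the ones quoted above. The nesting level
# is read from the bullet-type-N class (an assumption about Wave's markup).
PARA = re.compile(
    r'<p\b[^>]*class="simulated-li bullet-type-(\d+)"[^>]*>(.*?)(?:<br>)?</p>',
    re.S,
)


def flat_to_nested(html):
    """Hypothetical helper: turn flat list paragraphs into nested ul/li."""
    out = []
    depth = -1  # -1 means no list is open yet
    for match in PARA.finditer(html):
        # Never jump more than one level deeper than the previous item,
        # so we cannot emit a ul directly inside a ul.
        level = min(int(match.group(1)), depth + 1)
        text = match.group(2).strip()
        if level > depth:
            out.append("<ul>")  # nested list goes inside the still-open li
        else:
            out.append("</li>")  # close the previous item
            out.extend(["</ul></li>"] * (depth - level))  # leave nested lists
        out.append("<li><p>%s</p>" % text)
        depth = level
    if depth >= 0:
        # Close the last item and every list that is still open.
        out.append("</li>")
        out.extend(["</ul></li>"] * depth)
        out.append("</ul>")
    return "".join(out)
```

Clamping the level to one deeper than the previous item is a design choice: it means a stray over-indented bullet gets pulled up a level rather than producing the ul-inside-ul structure that other editors generate.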

Why is this approach of using plain-old paragraphs reasonable? It’s because of the mess that most editors get into when editing lists. All the in-browser editors I have seen have major problems making sensible structure: they let you do stupid, meaningless things like have two adjacent bullet lists, but don’t let you collapse them into one. I last wrote about this when looking at Mozilla Seamonkey Composer, which is a desktop tool.

The Wave approach is very similar to what we do in ICE. ICE uses styles in a word processor (MS Word or OpenOffice.org Writer) to imply structure. In an ICE document the above list would be made using styles, shown here in braces. The ICE Toolbar has buttons just like the Wave ones to toggle bullets and promote and demote.

  • {li1b} List 1

    • {li2b} List 2

    • {li2b} List 2

  • {li1b} List 1

So, provided there is a way in Wave for us to add an export-as-HTML feature like the one in ICE, which I’m sure there is, I’m happy with the flat-paragraph approach. I would really love to see blockquotes supported, and custom styles would be really great. But even if blockquotes aren’t supported, we can look for indented paragraphs and map them to blockquote elements.
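
To make the blockquote idea concrete, here is a minimal Python sketch. It rests on an assumption I have not checked against Wave's real output: that an indented non-list paragraph carries a `margin-left` style, as the list paragraphs do, but lacks the `simulated-li` class. The function name `indented_to_blockquote` is hypothetical.

```python
import re

# An "indented paragraph" here means: a <p> with a non-zero margin-left in
# its style attribute and no simulated-li class. This is an assumed
# convention for illustration, not documented Wave behaviour.
INDENTED = re.compile(
    r'<p\b(?![^>]*simulated-li)[^>]*style="[^"]*margin-left:\s*[1-9]\d*px[^"]*"'
    r'[^>]*>(.*?)(?:<br>)?</p>',
    re.S,
)


def indented_to_blockquote(html):
    """Wrap each indented, non-list paragraph in a blockquote element."""
    return INDENTED.sub(r"<blockquote><p>\1</p></blockquote>", html)
```

Each indented paragraph gets its own blockquote in this sketch; a real exporter would probably want to merge adjacent ones into a single element.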


The other thing that’s essential for writing a paper is good reference and citation support. I asked on Twitter if there was a Zotero gadget yet, and Bruce Darcus pointed me to Igor, which doesn’t support Zotero, but does connect to PubMed, Connotea or Citeulike. It works by looking for text like:

(cite sefton 2006)

It then inserts a reference and updates the bibliography. I have not tried Igor, but it looks like it is limited to one citation style.
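
For illustration, spotting markers of that shape in plain text takes only a few lines of Python. This is a guess based on the single example above, not Igor's actual parser: I am assuming a marker is the word "cite", one author token and a four-digit year inside parentheses, and the function name `find_cite_markers` is made up.

```python
import re

# Assumed marker shape, inferred from the one example "(cite sefton 2006)":
# the literal word "cite", an author token, then a four-digit year.
CITE_MARKER = re.compile(r"\(cite\s+([^\s()]+)\s+(\d{4})\)")


def find_cite_markers(text):
    """Return (author, year) pairs for every marker found in the text."""
    return CITE_MARKER.findall(text)
```

A robot would then look each pair up in whichever reference service it is connected to and swap the marker for a formatted citation.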

All I would want is a plugin that looks for links to an online Zotero account like this: http://www.zotero.org/ptsefton/items/77278, or to a DOI, as I described back in April, and provides a variant of the Zotero word processor plugin feature to format citations and bibliographies. One issue might be that, as I understand it, parts of the Zotero citation code depend on Firefox-specific libraries, so they can’t be made to work across browsers.
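
The link-spotting half of such a plugin could be sketched like this in Python. The Zotero pattern is modelled on the one URL above and nothing else, the DOI pattern is a common heuristic rather than a full validator, and the name `classify_link` is hypothetical. Formatting the citations is the hard part and is not attempted here.

```python
import re

# Modelled on http://www.zotero.org/ptsefton/items/77278 (user, then item id).
ZOTERO_ITEM = re.compile(r"^https?://(?:www\.)?zotero\.org/([^/\s]+)/items/(\d+)/?$")

# Heuristic only: "10.", a 4-9 digit registrant code, a slash, then a suffix.
DOI = re.compile(r"\b(10\.\d{4,9}/[^\s\"<>]+)")


def classify_link(url):
    """Return a tuple describing a citable link, or None if it is neither."""
    match = ZOTERO_ITEM.match(url)
    if match:
        return ("zotero", match.group(1), match.group(2))
    match = DOI.search(url)
    if match:
        return ("doi", match.group(1))
    return None
```

For example, `classify_link("http://www.zotero.org/ptsefton/items/77278")` gives back the account name and item number, which is all a formatter would need to fetch the metadata.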


I think Wave shows some promise as a collaborative editing tool, but it’s only going to be useful for simple documents to start with, what with the lack of table and numbered list support. I’d be surprised if Zotero support doesn’t manifest soon, but if it doesn’t then we’ll probably get around to that in my team at some stage.

Of course there’s lots more to talk about with the potential for embedding scientific objects etc, as discussed in this post. I’ll come back to that.

  • Bruce

    All I would want would be a plugin to look for links to an online Zotero account like this: http://www.zotero.org/ptsefton/items/77278 or to a DOI, as I described back in April and provide a variant of the Zotero word processor plugin feature to format citations and bibliographies.

    Tying such a plug-in to a single service is a bad idea. I’ve instead suggested generic global identifiers (URIs), and then allow configuration of trusted sources.

  • http://dablog.ulcc.ac.uk/ Richard M. Davis

    Hi Pete

    Seems a shame to me that we have to consider “reasonable” something aesthetically and structurally inferior just because no one’s been able to create a WYSIWYG editor that can handle HTML lists properly. Something about babies and bathwater springs to mind!

    Google Docs, BTW, suffers from the same problem as Seamonkey, creating ul/ul instead of ul/li/ul, etc. It didn’t ought to be that difficult! There are trickier decisions that WYSIWYG HTML editors seem to make sanely.

    For me HTML paras, lists, blockquotes and tables (and a few other things, like <q>) offer enough basic logical and semantic structure for the bulk of stuff I write. (I used to like asWedit, then Netscape/Seamonkey, till GDocs came along – fixing their eccentricities by hand. Better that than wrestle with whatever weird labyrinths Word wanted to lead me down.) I thought the idea of Scholarly HTML was to add the academic extras to that basic, sound structure.

    Like you suggest, with a bit of XSLT etc it’s not hard to transform our p[@class='simulated-li bullet-type-0'] into a ul/li – but how unnecessarily verbose is that when HTML provides exactly what we want?!

    I think what I’m trying to say is, isn’t it better to encourage developers a bit more, to do lists the sane and sensible way, rather than throwing in the towel and accepting some ugly workaround?

    • http://ptsefton.com ptsefton

      @richard I think this list-editing thing deserves yet another post, but just quickly for now: I may have been wrong to say that it is OK for Wave not to use lists. Here are some things that a widget/robot could do easily if lists were there.
      Turn a list into:

      a menu
      a diagram
      a collapsed list with (+) buttons to expand it
      a task list with checkboxes

      Easy to do if the nesting is explicit, very hard if you have to write your own code to infer structure.

      So – we do need to encourage the authors of these tools to think about supporting proper structured HTML. I will work on some thoughts about how to do this with a simple toolbar like the one in Wave.

  • http://dablog.ulcc.ac.uk/ Richard M. Davis

    Ha ha! If I close the <q> tag I inserted above will it fix the rampant “quotes”! (Got to love HTML!! ;)

  • Pingback: Science in the Open » Blog Archive » The triumph of document layout and the demise of Google Wave