
Challenge: Produce XHTML and print from a Writer document

2007-07-03


I'm working on a paper about how we use styles in the ICE system to structure and format documents for print and the web. I posted earlier about how I went trying to format a document for HTML using Word 2003 on Windows, which is the latest version I had to hand.

Now it's time to try with OpenOffice.org Writer. Actually I'm using the latest NeoOffice for the Mac, but it's essentially the same thing. I wrote about this some time ago when I got cranky about the HTML export in OpenOffice.org. Back then, in 2005, I had awful trouble trying to use the list styles that come with Writer, the word processor.

The challenge is to take my recent paper for the ETD conference, "An integrated approach to preparing, publishing, presenting and preserving theses", and see if I can make a decent HTML document from it.

Thought I'd try a different approach this time, and create a new HTML document in Writer. In that mode there's a weird thing: you can look at the HTML source, but once you do you can't switch back to any other view, so you have to close the document and reopen it. And it turns out that if you work in HTML mode you don't get headers and footers. The solution is to create a Writer document, then save as HTML, and the headers and footers remain. You'd probably just want to do a Save as HTML at the end as an export anyway.

The first bit went better than Word. I was able to format the first chunk of my document using a Heading 1, some default text and a style called Quotations (not sure why that's plural). The resulting HTML is ugly, but not too bad.

It has a nice clean heading and some paragraphs:

Introduction

This paper describes progress made on a project funded by the Australian government to create a free (as in open source) software application and associated documentation. The project is known as the Integrated Content Environment for research and scholarship or ICE-RS. The project is tasked with creating and/or documenting software and work practices that allow academics and students writing-up research to create documents, collaborate, manage, publish and deposit their work in repositories. An overview of the project, derived from the successful proposal document is available on the ICE website (Sefton 2006b).

And blockquotes. It's not XHTML, but it could presumably be transformed into XHTML pretty easily.

In the institutional repository world, the Adobe PDF format is currently the expected norm for document delivery. Even though institutional repositories are web-based systems most content is not available in the native web format, HTML. HTML is more usable and flexible than PDF in many situations, allowing users to skim and sample content more easily than PDF. PDF, on the other hand, is a good solution for printing long documents and can be configured to make reading even book-length content a comfortable experience.

So score one for Writer.
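To give a rough idea of what "transformed into XHTML pretty easily" might involve, here is a minimal Python sketch using only the standard library. The sample markup is hypothetical (it is my guess at the general shape of old-style uppercase export, not Writer's actual output), and the names `XhtmlWriter` and `to_xhtml` are made up for illustration. It lowercases tag and attribute names, quotes attribute values, escapes text, self-closes a few void elements and closes anything left open. It does not handle implied end tags, so a paragraph opened while another is still open would end up nested inside it.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never have content; XHTML wants them self-closed.
VOID = {"br", "hr", "img", "meta", "link", "input"}


class XhtmlWriter(HTMLParser):
    """Re-serialise tag soup as well-formed, lowercase markup (a sketch)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # output fragments, joined at the end
        self.open_tags = []  # stack of elements still waiting for a close

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        # A valueless attribute (e.g. NOWRAP) becomes name="name".
        attr_text = "".join(
            f' {name}="{escape(value if value is not None else name, quote=True)}"'
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{attr_text} />")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.open_tags:
            return  # void or stray end tag: nothing to close
        # Close everything opened since the matching start tag.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append(f"</{current}>")
            if current == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        # Close whatever the source left open so the output is well-formed.
        closing = "".join(f"</{tag}>" for tag in reversed(self.open_tags))
        return "".join(self.out) + closing


def to_xhtml(html_text):
    writer = XhtmlWriter()
    writer.feed(html_text)
    writer.close()  # flush any buffered trailing text
    return writer.result()


if __name__ == "__main__":
    # Hypothetical input, not copied from a real export.
    print(to_xhtml('<H1 CLASS="western">Introduction</H1><P>Some text<BR>more'))
```

A real conversion would also need a doctype, a namespace declaration and a sensible policy for presentational elements such as `<FONT>`, so treat this as a starting point only.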

Now, I've said before that I can't figure out how to use the built-in list styles in Writer, and today they make no more sense to me than they did nearly two years ago, so I decided to go with the format-only approach.

Here's the target text:

graphics4

Using the lists in Writer is still a surreal experience, partly because the bullets and numbering toolbar comes and goes as you click in and out of a list.

I figured out how to get a list to look roughly right on the screen.

  1. You can make a list item by clicking on the number button:

    graphics1

    Once you've done that, though, I'm not sure how you're meant to add paragraphs to the first item so that they are indented underneath it.

  2. One thing I tried was Insert Unnumbered Entry, the button that looks like this:

    graphics2

    But you have to click that to make an entry and then copy-paste text into it; you can't use it like most other buttons to format things.

The result in Firefox, as of when I gave up in disgust, is this:

graphics3

The first list item (1.) is the result of a lot of laborious clicking of Insert Unnumbered Entry and dragging in text. I have no idea how the second item (also 1.) can be made to number correctly. The follow-on paragraph in the second item was put there using the indent button; it looks OK in Writer but not in Firefox.

The resulting HTML is a bloody disgrace. The structure starts with a one-item list.

  1. Initially the handle will resolve to the server-side ICE repository, which because it is in the Subversion system is web-addressable, although usually authentication will be required.

Which is followed up by a no-item list. That's right, a list with no list items. At least it got the preformatted tag right, but I'm not quite sure what the two <FONT> elements are doing in there. I'm also not sure what this western bit is, but that's cool cos I'm in Toowoomba. We might be part of South East Queensland, but we're west of Brisbane.

    The author need not worry about handles at all: they can use links in the usual way to manage their content and the system will manage the creation and management of handles when content is exported from the system.

    ...

    <pre class="western" style="margin-bottom: 0.5cm;">http://localhost:8000/some-path

    When exported it would use a handle resolver:
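The no-item list problem is the sort of thing a script could flag automatically. Below is a minimal Python sketch, again standard library only, that reports any `<ol>` or `<ul>` closed without a single `<li>` of its own. The sample markup is a hypothetical reconstruction of the structure described above, not the actual export, and `EmptyListFinder` and `find_empty_lists` are names invented for this example.

```python
from html.parser import HTMLParser

LIST_TAGS = {"ol", "ul"}


class EmptyListFinder(HTMLParser):
    """Collect lists that contain no <li> of their own (a sketch)."""

    def __init__(self):
        super().__init__()
        self.seen = 0    # how many lists have been opened so far
        self.stack = []  # one entry per currently open list
        self.empty = []  # (ordinal, tag) for each list with no items

    def handle_starttag(self, tag, attrs):
        if tag in LIST_TAGS:
            self.seen += 1
            self.stack.append({"tag": tag, "ordinal": self.seen, "items": 0})
        elif tag == "li" and self.stack:
            # An item belongs to the innermost open list only.
            self.stack[-1]["items"] += 1

    def handle_endtag(self, tag):
        if tag in LIST_TAGS and self.stack and self.stack[-1]["tag"] == tag:
            info = self.stack.pop()
            if info["items"] == 0:
                self.empty.append((info["ordinal"], info["tag"]))


def find_empty_lists(html_text):
    finder = EmptyListFinder()
    finder.feed(html_text)
    finder.close()
    return finder.empty


if __name__ == "__main__":
    # Hypothetical markup: a one-item list followed by a list with no items.
    sample = (
        "<OL><LI><P>Initially the handle will resolve</P></LI></OL>"
        '<OL><P>The author need not worry</P><PRE CLASS="western">x</PRE></OL>'
    )
    for ordinal, tag in find_empty_lists(sample):
        print(f"list number {ordinal} (<{tag}>) has no list items")
```

This only detects the problem. Deciding whether the stray paragraphs should be folded into the previous item is a separate job, and this sketch doesn't attempt it.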

Conclusion

This sux.

Which is the same verdict as for Word 2003.

Not usable out of the box for writing a technical paper that needs to be delivered in print and web formats. Writer does make PDF, though, which is a plus.