Challenge: Produce XHTML and print from a Writer document
2007-07-03
I'm working on a paper about how we use styles in the ICE system to structure and format documents for print and the web. I posted earlier about how I went trying to format a document for HTML using Word 2003 on Windows, which is the latest version I had to hand.
Now it's time to try with OpenOffice.org Writer. Actually I'm using the latest NeoOffice for the Mac, but it's essentially the same thing. I wrote about this some time ago when I got cranky about the HTML export in OpenOffice.org. Back then, in 2005, I had awful trouble trying to use the list styles that come with Writer, the word processor.
The challenge is to take my recent paper for the ETD conference, An integrated approach to preparing, publishing, presenting and preserving theses, and see if I can make a decent HTML document from it.
I thought I'd try a different approach this time and create a new HTML document in Writer. In that mode there's a weird thing: you can look at the HTML source, but once you do you can't switch back to any other view, so you have to close the document and reopen it. It also turns out that if you work in HTML mode you don't get headers and footers. The solution is to create a Writer document and then save it as HTML, in which case the headers and footers remain. You'd probably just want to do a Save as HTML at the end as an export anyway.
The first bit went better than Word. I was able to format the first chunk of my document using a Heading 1, some default text and a style called Quotations (not sure why that's plural). The resulting HTML is ugly, but not too bad. It has a nice clean heading and some paragraphs:
Introduction
This paper describes progress made on a project funded by the Australian government to create a free (as in open source) software application and associated documentation. The project is known as the Integrated Content Environment for research and scholarship or ICE-RS. The project is tasked with creating and/or documenting software and work practices that allow academics and students writing-up research to create documents, collaborate, manage, publish and deposit their work in repositories. An overview of the project, derived from the successful proposal document is available on the ICE website (Sefton 2006b).
And blockquotes. It's not XHTML, but it could presumably be transformed into XHTML pretty easily.
In the institutional repository world, the Adobe PDF format is currently the expected norm for document delivery.
Even though institutional repositories are web-based systems, most content is not available in the native web format, HTML. HTML is more usable and flexible than PDF in many situations, allowing users to skim and sample content more easily than PDF. PDF, on the other hand, is a good solution for printing long documents and can be configured to make reading even book-length content a comfortable experience.
So score one for Writer.
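On the idea that this output could be transformed into XHTML pretty easily, here is a minimal sketch of what that might look like, using only the Python standard library. It lowercases tag and attribute names, quotes every attribute value, self-closes void elements such as BR, and re-escapes text. The sample markup, the `XhtmlWriter` class and the `to_xhtml_fragment` function are all made up for illustration; this is not Writer's actual output and not how ICE does it. It also does not repair unclosed or badly nested elements, so it only helps when the input is already reasonably tidy.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Elements that never have content and must be self-closed in XHTML.
VOID = {"br", "hr", "img", "meta", "link", "input"}


class XhtmlWriter(HTMLParser):
    """Re-serialise tidy HTML as an XHTML-style fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already lowercased the tag and attribute names.
        # A valueless attribute (e.g. NOWRAP) gets its own name as its value.
        attr_text = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        end = " /" if tag in VOID else ""
        self.out.append(f"<{tag}{attr_text}{end}>")

    def handle_endtag(self, tag):
        # Void elements were closed when they were opened.
        if tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser; escape them again.
        self.out.append(escape(data))


def to_xhtml_fragment(html):
    writer = XhtmlWriter()
    writer.feed(html)
    writer.close()
    return "".join(writer.out)


# Made-up sample in the uppercase style of older HTML exports.
sample = '<H1 CLASS="western">Introduction</H1><P>One<BR>two &amp; three</P>'
xhtml = to_xhtml_fragment(sample)
# Wrapping the fragment in a root element lets an XML parser check it.
ET.fromstring(f"<div>{xhtml}</div>")
print(xhtml)
```

A real conversion would also need a DOCTYPE, the XHTML namespace and some cleanup of presentational markup, which is where a proper tidying tool would earn its keep.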
Now, I've said before that I can't figure out how to use the built-in list styles in Writer, and today they make no more sense to me than they did nearly two years ago, so I decided to go with the format-only approach.
Here's the target text:
Using the lists in Writer is still a surreal experience, partly because the bullets and numbering toolbar comes and goes as you click in and out of a list.
I figured out how to get a list to look roughly right on the screen.
- You can make a list item by clicking on the number button. Once you've done that, though, I'm not sure how you're meant to add paragraphs to the first item so that they are indented underneath it.
- One thing I tried was the Insert Unnumbered Entry button. But you have to click that to make an entry and then copy-paste text into it; you can't use it like most other buttons to format things.
The result in Firefox, as of when I gave up in disgust, is this:
The first list item (1.) is the result of a lot of laborious clicking of Insert Unnumbered Entry and dragging in text. I have no idea how the second item (also 1.) can be made to number correctly. The follow-on paragraph in the second item was put there using the indent button; it looks OK in Writer but not in Firefox.
The resulting HTML is a bloody disgrace. The structure starts with a one-item list.
Initially the handle will resolve to the server-side ICE repository, which because it is in the Subversion system is web-addressable, although usually authentication will be required.
Which is followed up by a no-item list. That's right, a list with no list items. At least it got the preformatted tag right, but I'm not quite sure what the two <FONT> elements are doing in there. I'm also not sure what this “western” bit is, but that's cool cos I'm in Toowoomba. We might be part of South East Queensland, but we're west of Brisbane.
The author need not worry about handles at all: they can use links in the usual way to manage their content and the system will manage the creation and management of handles when content is exported from the system.
...
<pre class="western" style="margin-bottom: 0.5cm;">http://localhost:8000/some-path
When exported it would use a handle resolver:
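A list with no list items is easy enough to check for mechanically. The following is a rough sketch, assuming the export nests its list tags properly; the `EmptyListFinder` class, the `find_empty_lists` function and the sample markup are invented for illustration and only imitate the shape of the output described above, not the exact HTML Writer produced.

```python
from html.parser import HTMLParser


class EmptyListFinder(HTMLParser):
    """Record every <ol>/<ul> that closes without having seen an <li>."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one [tag, item_count, line] entry per open list
        self.empty = []  # (tag, line) for each list that had no items

    def handle_starttag(self, tag, attrs):
        if tag in ("ol", "ul"):
            self.stack.append([tag, 0, self.getpos()[0]])
        elif tag == "li" and self.stack:
            # Count the item against the innermost open list.
            self.stack[-1][1] += 1

    def handle_endtag(self, tag):
        if tag in ("ol", "ul") and self.stack:
            # Assumes lists are properly nested, so this closes the innermost.
            name, count, line = self.stack.pop()
            if count == 0:
                self.empty.append((name, line))


def find_empty_lists(html):
    finder = EmptyListFinder()
    finder.feed(html)
    finder.close()
    return finder.empty


# Made-up sample: a one-item list followed by a list holding only a <pre>.
sample = (
    "<OL><LI><P>Initially the handle will resolve</P></LI></OL>\n"
    '<OL><PRE CLASS="western">http://localhost:8000/some-path</PRE></OL>'
)
for tag, line in find_empty_lists(sample):
    print(f"<{tag}> opened on line {line} has no list items")
```

Run over a saved export, this would at least point at the line where each offending list starts, which beats reading the source by eye.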
Conclusion
This sux.
Which is the same verdict as for Word 2003.
Not usable out of the box for writing a technical paper that needs to be delivered in print and web formats. Writer does make PDF, though, which is a plus.