ptsefton.github.io

[Last post](http://ptsefton.com/blog/2005/11/02/more_about_ice_and_why_xhtml_is_important) I replied to [Peter Albion's comments](http://edux.usq.edu.au/~albion/weblog/?p=117) on a previous post about why I obsess about XHTML export. That post was mostly in defense of the ICE application. Helpfully, [Ian Barnes of ANU posted a nice summary](http://www.bloglines.com/blog/barnes?id=17) of one of the really good reasons to worry about XHTML export: **sustainability**. If you care about the long-term viability of documents, then worrying about export to a standard XML format is essential, to avoid being locked into an unreadable format.

Ian explains sustainability well, so I'll just add that people who have used styles consistently in the past now, as Ian puts it, “**get sustainability for free**”. I [wrote about this](http://ptsefton.com/blog/2005/03/02/use_styles) a while ago, pointing out that the real value is in having consistent structure that can be mapped to other structures:

> What has happened in the last ten years is the word processors I work with (Microsoft Word and now OpenOffice.org Writer) have made it easier and easier to get XML out and back in again with no major innovations in the way they work. Although Word is clearly getting worse.
>
> Using styles has been essential. Documents created in 1997 at Standards Australia would slot straight into the work I've been doing on the Word Processor Interoperability Project with a few simple style-replacements.
>
> The message here is that it is consistent styles that are the most help, not 'doing XML' (yes, yes exporting to XML is an essential insurance policy and a requirement for any system). Standards Australia's strategy of keeping the Standards in Word format, using it to render them, then 'siphoning off' XML for web output has worked out well, but only because of the styles.

Another key issue is **usability**.
In conversation, Peter Albion has wondered whether we should not bother with HTML and just use PDF. But “PDF Files for Online Reading” is the number two mistake in the [Top Ten Mistakes in Web Design](http://www.useit.com/alertbox/9605.html) (according to usability expert Jakob Nielsen).

> Users hate coming across a PDF file while browsing, because it breaks their flow. Even simple things like printing or saving documents are difficult because standard browser commands don't work. Layouts are often optimized for a sheet of paper, which rarely matches the size of the user's browser window. Bye-bye smooth scrolling. Hello tiny fonts. Worst of all, PDF is an undifferentiated blob of content that's hard to navigate.
>
> PDF is great for printing and for distributing manuals and other big documents that need to be printed. Reserve it for this purpose and convert any information that needs to be browsed or read on the screen into real web pages.
>
> <http://www.useit.com/alertbox/9605.html>

So, I ask again: if we can painlessly make courses that allow **HTML for online reading**, so people can scan and find the bits they are interested in and follow links that will work, **and have PDF** for printing, why wouldn't we write some cost-effective software to make that happen?

Peter Albion's closing paragraph questions the point of all this work I've been doing (I think I've answered all his points in the last couple of posts):

> So, given the pain that seems to be involved, is the answer to the original question [Why do I keep going on about HTML export from word processors?] a matter of satisfaction at bending the machine to the will of a “hard master” (Sherry Turkle in The Second Self) or of masochism? It doesn’t seem to be a matter of need. Word produces respectable PDF and, when I need or want (X)HTML, there are adequate tools available for that too.
>
> <http://edux.usq.edu.au/~albion/weblog/?p=117>

I'm not sure where Peter gets this notion that I'm **suffering pain** in this process. Certainly my team and I do things that others might choose not to, but I for one am in this line of work because I find it rewarding, even financially (I've had well-paying jobs in this field). What I find most rewarding is not writing programs but fixing situations that are obviously broken, with new processes that work around the limitations and in-built stupidity of mass-market software like Word and Writer.

Hang on, why am I engaging with this name-calling? Back to the matter under discussion. Peter seems to be implying that:

1. you can easily generate PDF from word processors, which is true, and
2. there are adequate tools for making XHTML, which is maybe true, depending on who you are.

But what about **doing both HTML and PDF**, really well, from the same source? It's not clear that he's saying that there are tools for that. I'm asking – what are good tools for that? Ones that might have a hope of **broad uptake in a university context**?

If someone else would help complete this stuff I would move on to something else. And it seems that the [work Ian Barnes is doing](http://www.bloglines.com/blog/barnes?id=17) may help here:

> Personally I think a well-designed, standardised, structured format like DocBook is even better. That's what I'm aiming at in my work at ANU for the Australian Partnership for Sustainable Repositories. I'm using Peter's ICE template as the starting point, and converting the resulting documents into DocBook, which can be put under good version control, transferred to any platform, converted using pre-existing software into good HTML and ever-improving PDF.
>
> <http://www.bloglines.com/blog/barnes?id=17>

If Ian's software can help us hook ICE documents into the established set of tools for DocBook, with ways of generating HTML and PDF and so on, then that's great – I'll stop playing with HTML export and move on.