COMMAND LINE WARRIORS

Taking Control of your Own Technology

Include ODF support in the Linux Standard Base?

26 March 2008

The whole world, me included, seems to have become obsessed by Microsoft and its only partially-open, only partially-XML, document format.

Document Freedom Day

Let's take a post off and ignore them, and let's also ignore their users, the proprietary serfs, for a moment, because we have mountains of our own office software. Let's devote a post just to that before getting back into the OOXML war. It is, after all, Document Freedom Day!

So imagine I write a word processed document using Abiword, make a spreadsheet using Gnumeric, and draw a diagram using Dia.

Can these documents be opened without loss by my friend who uses Kword, Kspread and Kivio? What about another friend who uses OpenOffice Writer, Calc and Draw?

Those of us in the free world may have software freedom, but we still do not have free exchange of the information contained in our office documents.

Enter OpenDocument

It was for this kind of problem that the OpenDocument format was created. An open process began in 2002, when software creators and large users met and defined the standard; the general public then had plenty of time, years in fact, to peer review it and provide comments. Now, several years on, ODF is tried and trusted technology, unlike certain half-baked Johnny-come-latelies.

The software organisations involved included KDE (maintainers of KOffice), Sun (maintainers of OpenOffice), Corel (maintainers of WordPerfect), IBM (Lotus 1-2-3, Workplace), and Adobe (FrameMaker, Distiller). Other interested parties included the Society of Biblical Literature, the National Archives of Australia, the New York State Office of the Attorney General, Novell, Boeing, Nokia and lots of others.

The general public were also made to feel included as equal participants in the process; nothing was hidden behind password protection. The result of all this was that the main phase produced a document format that can be useful to everybody.

There was also another phase where they tried to go beyond client-based office software to look at the web. The OpenDocument format is quite clean XML, so the idea was that OpenDocument should also be a medium for data, information, and knowledge exchange. OpenDocument files can be uploaded to servers, parsed as XML and processed for different uses, while web servers and browsers could automatically render OpenDocument files back to humans as web pages. OpenDocument would bridge the information gap between thick-client office work and the web.
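To make the "parsed as XML" part concrete, here is a rough sketch in Python using only the standard library. An OpenDocument file is a zip package whose main body lives in a member called content.xml, so a server can open the package and walk the XML directly. The function names (extract_paragraphs, make_sample_package) are my own inventions for illustration, and the sample package is a stripped-down stand-in containing only content.xml, not a complete ODF file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces defined by the OpenDocument specification.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def extract_paragraphs(odf_bytes):
    """Return the text of every text:p element in the package's content.xml."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    # itertext() also collects text nested inside child elements such as text:span.
    return ["".join(p.itertext()) for p in root.iterfind(".//text:p", NS)]


def make_sample_package():
    """Build a tiny in-memory stand-in for an uploaded .odt (illustration only)."""
    content = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<office:document-content'
        ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
        ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
        ' office:version="1.2">'
        '<office:body><office:text>'
        '<text:p>Hello from the server.</text:p>'
        '<text:p>Second <text:span>paragraph</text:span>.</text:p>'
        '</office:text></office:body>'
        '</office:document-content>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", content)
    return buffer.getvalue()


if __name__ == "__main__":
    for paragraph in extract_paragraphs(make_sample_package()):
        print(paragraph)
```

A real upload handler would pass the uploaded file's bytes to extract_paragraphs instead of the sample, and would probably want to check the package's mimetype member first.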

Enter the Linux Standard Base

The Linux Standard Base Desktop Specification provides a standard desktop for developers to target when writing desktop applications.

So for example, bitmapped images are expected to be in .PNG format, compressed photographic images in JPEG. Users with accessibility needs should be supported using the ATK ToolKit. Fonts are done using freetype2. Cryptographic features are implemented in OpenSSL. You get the idea.

The Desktop Specification and OpenDocument?

Putting the two together, should the Linux Standard Base Desktop Specification provide a specified standard for office documents? I.e. should the Linux Standard Base specify OpenDocument for office documents as it specifies .PNG for bitmaps?

As you may have guessed, I personally think it should. However, we need to be aware of the differences between the Desktop Specification and OpenDocument. The Desktop Specification follows the traditional Unix philosophy of writing applications as layers upon shared libraries.

At the moment, an application cannot say to a free operating system, 'here is some data, please save it as an OpenDocument file'. This is because there is no official ODF library included by default in the distributions. I think there should be one, but currently there is not.

ODF is implemented again and again by each ODF-supporting application, as well as in libraries for every major programming language.

So what I am wondering is, would it be possible for someone, or some company, to take a hunk of C or C++ from one of these programs and package it as "libodf"?

The other programs, and the other programming language bindings, could then be a lot simpler. If they wanted, they could link to libodf rather than re-implementing ODF support each time.

Of course, something huge like OpenOffice would still probably want a lot more on top, but smaller, domain-specific programs are the majority of the software industry. Say we write a business application that outputs data; a libodf might be an ideal way to output portable and reusable data, rather than creating yet another unique dump of internal data structures that locks in users.
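To give a feel for how small the 'here is some data, please save it as an OpenDocument file' call could be for such a program, here is a minimal sketch in Python. No libodf exists, so save_as_odt is a hypothetical name of my own, and this is nowhere near a full implementation: it writes only the mimetype entry, a manifest and a content.xml with plain paragraphs, with no styles or metadata. I have not checked it against every office suite, so treat it as an illustration of the idea rather than a reference writer.

```python
import io
import zipfile
from xml.sax.saxutils import escape

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"

# Minimal manifest listing the package root and content.xml.
MANIFEST = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<manifest:manifest'
    ' xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"'
    ' manifest:version="1.2">'
    '<manifest:file-entry manifest:full-path="/" manifest:version="1.2"'
    ' manifest:media-type="%s"/>'
    '<manifest:file-entry manifest:full-path="content.xml"'
    ' manifest:media-type="text/xml"/>'
    '</manifest:manifest>'
) % ODT_MIMETYPE

CONTENT_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<office:document-content'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
    ' office:version="1.2">'
    '<office:body><office:text>%s</office:text></office:body>'
    '</office:document-content>'
)


def save_as_odt(paragraphs, target):
    """Hypothetical libodf-style call: write paragraphs to a path or file object."""
    body = "".join("<text:p>%s</text:p>" % escape(p) for p in paragraphs)
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as package:
        # The mimetype entry goes first and is stored uncompressed,
        # so that tools can identify the file type without unzipping it.
        package.writestr("mimetype", ODT_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        package.writestr("META-INF/manifest.xml", MANIFEST)
        package.writestr("content.xml", CONTENT_TEMPLATE % body)


if __name__ == "__main__":
    # Write to memory here; pass a filename such as "report.odt" to write to disk.
    buffer = io.BytesIO()
    save_as_odt(["Quarterly report", "Sales & costs are attached."], buffer)
    with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as package:
        print(package.namelist())
```

The point of a shared library would be that the business application only ever sees the one-line save_as_odt call, while the packaging details above live in one well-tested place.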

Let me know what you think.

1 Paula says...

Sure, ODF should be on the LSB... but what about the Ogg framework (Vorbis, Theora, etc)? Multimedia is content, too. Let's blog for all the more important free formats in the LSB!

Posted at 1:07 a.m. on March 26, 2008


2 Jonas says...

Agreed. It would be great if there was a lib for that and it was considered required. That way maybe KWord and OoWriter would be more compatible with each other... an ODF file saved using OO sometimes looks very odd in KWord, and the other way around.

And speaking of Koffice...if I understood things correctly, they are working on getting it to work like that in KDE. Granted, it needs to be taken further so OO, Gnome, perl-scripts and what not can take advantage of it as well but it's a start.

Posted at 1:13 a.m. on March 26, 2008


3 Anonymous says...

Inclusion in the LSB does not imply that any notice will be taken of it. The LSB says that RPM is /the/ package format - yet that has not made Debian/Ubuntu/Slackware/Gentoo/Arch... drop their own and start using it (possibly because RPM is a worse format, but that's another rant...).

Posted at 3:50 p.m. on March 26, 2008


4 Andrew West says...

Anonymous,

That hasn't stopped LSB themselves from certifying non-RPM based distros as LSB certified. http://www.linux-foundation.org/en/LSB_Distribution_Status

Perhaps the difference is that, while RPM is the preferred format, it isn't required. (Thank god)

http://refspecs.linux-foundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/swinstall.html

Posted at 4:07 p.m. on March 26, 2008


5 Sten Solberg says...

Yes, yes, yes and yes.

We Linuxers take pride in our free choice of distros (and tweaks!), but too often forget/neglect that we are also free to cooperate more closely. A libodf is overdue.

Posted at 8:57 a.m. on March 27, 2008


6 Chris Lees says...

The idea of a "libodf" is a great one. And I think all programs should support saving into ODF, if applicable. I've often thought it dumb that other Linux-based office software can't write into ODF.

Including Ogg into the LSB is also a good idea, as long as it doesn't have functions specific to Theora. Theora, while free for Free Software use, is still patented.

Posted at 5:54 a.m. on March 28, 2008


7 AJS says...

I don't approve of LSB at all.

If you compile a package from source, it will just work, irrespective of wherever you put your libraries or what is in your execution path.

LSB is nothing more than an attempt to create an excuse for vendors to distribute code in binary form only. Ultimately, if this happens and a lot of software ends up being released in this way, it will mean a loss of freedom for everyone.

What would be really nice would be for somebody like Ubuntu to call a final halt on -dev packages and integrate the developers' files into the main package. Most people have the disk space to spare nowadays; it's no great loss if they don't ever go on to do things "by hand", but the ability to do so is just a bit closer if they want it.

Posted at 1:22 p.m. on March 29, 2008


8 Chris Barts says...

Compiling things by hand is only possible if you have the disk, RAM, and CPU to spare, not only for sources but for all of the toolchain (gcc, cpp, gas, ld, and so on) required by the source packages. That is not an option for many people and organizations.

Secondly, compiling things by hand means you abandon the package management system entirely and, therefore, lose an essential tool in the task of keeping your system running and upgradeable. I've tried to keep track of my own system entirely by hand and there are simply no advantages to it. Good, working package management systems (apt, for example) give me the system I want in a sustainable fashion.

Posted at 10:45 a.m. on March 30, 2008


9 Andy Loughran says...

I think the idea of libodf is also a way for the 'big boys' in odf to work together in order to allow the smaller fry to swim. If this library were cross-platform, it would mark an incredibly easy way for different office programs to collaborate. ISO could oversee libodf and make changes necessary. It would also mean no single vendor would have control over libodf, and individual programs would have a much harder time trying to hack binary blobs into the specification.

Posted at 2:29 p.m. on March 31, 2008


10 AJS says...

@ Chris Barts

I think you are seriously overestimating the requirements of the standard toolchain. When the first pre-compiled binary package management systems came out, the fastest processor speeds were a few hundred megahertz; and hard disk space was measured in megabytes, not gigabytes. Saving a few bytes here or there really could make a difference.

It's nice to have a package manager, but even huge distros such as Debian and Gentoo can't have everything in their package management systems -- and sometimes you want a version that is too recent to have been certified for inclusion. It's a fact of life that one day, sooner or later, you are going to have to build something from Source Code.

And when you do, it will be placed under /usr/local/ -- out of the way of your package manager -- unless you specify otherwise.

When you compile a package from Source Code, if it says it depends on foo, chances are it also depends on foo-dev as well. The main package only contains files essential for day-to-day running; files that are only necessary for development work are separated into the -dev package. But that is counter-intuitive: the homepage of project bar only says you need foo. It only makes the process appear arcane, esoteric and generally harder than it really is. The effect of this has been to turn many people off compiling from Source altogether.

It doesn't have to be that way. Compiling a package from Source Code is not hard. If you can spell "make install" you're already halfway there.

Dropping -dev packages will make main packages bigger, that's for sure; but my whole point is that the need to separate files only required for development from those required for day-to-day running no longer exists for most people. Anybody wanting to install stuff on a constrained system nowadays is in a minority, and probably knows what they are doing anyway.

Meanwhile, anybody who ever has to compile a package from Source -- which basically means pretty much everyone -- will find it a bit easier because the dependencies should already be satisfied.

Posted at 10:13 a.m. on April 5, 2008


11 Justin says...

Best idea I've heard in a while.

Posted at 3:46 a.m. on April 18, 2008


About

Hello, my name is Zeth, I'll be your host here.

Command Line Warriors is about taking control of your own technology. It looks at our experiences of computing, especially using GNU/Linux, the Python programming language and the command-line, and at issues such as techno-ethics, best practices and whatever is cool now. If you take control of your technology then you are a Warrior too!

This site is your site too which means that you can contribute and get involved. You can leave comments using the facility provided. For me, the comments and discussions are by far the best part of the site. So please do have your say!
