ReStructuredText tables and doctests

5 October 2008

As regular readers will know, I often share one-file Python scripts here. Recently I wrote table.py, a simple module for creating reStructuredText tables in Python. It is available from my code section, or you can go to the direct download. Save it as table.py somewhere you can find it!

As always, constructive comments, suggestions and improvements are very welcome.

The way it works is that you put the data you want into a matrix-like structure, i.e. a list or tuple containing a number of lists or tuples. Each inner list or tuple forms a table row, and each item inside it forms a cell.

So as an example, say we want to make a table showing the expansion of the European Union. We start with a matrix (i.e. a tuple of tuples):

>>> eu_enlargement = (("Name", "Year", "Members"), ("ECSC", "1958", "6"),
... ("EC", "1973", "9"), ("EC", "1981", "10"), ("EC", "1986", "12"),
... ("EU", "1995", "15"), ("EU", "2004", "25"), ("EU", "2007", "27"))

We import the Table class from the table.py module:

>>> from table import Table

We make an instance of Table, giving the matrix as the argument:

>>> eu_table = Table(data = eu_enlargement)

Or to be less verbose:

>>> eu_table = Table(eu_enlargement)

Lastly, we can print the data as a reStructuredText table:

>>> print eu_table.create_table()

Which outputs the following table:

+------+------+---------+
| Name | Year | Members |
+======+======+=========+
| ECSC | 1958 | 6       |
+------+------+---------+
| EC   | 1973 | 9       |
+------+------+---------+
| EC   | 1981 | 10      |
+------+------+---------+
| EC   | 1986 | 12      |
+------+------+---------+
| EU   | 1995 | 15      |
+------+------+---------+
| EU   | 2004 | 25      |
+------+------+---------+
| EU   | 2007 | 27      |
+------+------+---------+

Which, when rendered to HTML, looks like this:

Name  Year  Members
ECSC  1958  6
EC    1973  9
EC    1981  10
EC    1986  12
EU    1995  15
EU    2004  25
EU    2007  27

Very simple stuff, but handy if you are a reStructuredText fan.
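For readers who want to see the idea without downloading anything, here is a minimal sketch of how a grid table like the one above can be built. It is not the code from table.py; the function name grid_table and its helpers are made up for this illustration, and it assumes every row has the same number of cells and every cell is a string.

```python
def grid_table(rows):
    """Return an rST grid table for equal-length rows of strings.

    The first row is treated as the header. (Hypothetical helper,
    not part of table.py.)
    """
    # The width of each column is the length of its longest cell.
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]

    def divider(char):
        # Each cell gets one space of padding on either side, hence "+ 2".
        return "+" + "+".join(char * (width + 2) for width in widths) + "+"

    def line(row):
        cells = (" " + cell.ljust(width) + " "
                 for cell, width in zip(row, widths))
        return "|" + "|".join(cells) + "|"

    out = [divider("-")]
    for number, row in enumerate(rows):
        out.append(line(row))
        # Only the header row is underlined with "=".
        out.append(divider("=" if number == 0 else "-"))
    return "\n".join(out)


eu_enlargement = (("Name", "Year", "Members"), ("ECSC", "1958", "6"),
                  ("EC", "1973", "9"))
print(grid_table(eu_enlargement))
```

The two passes are the whole trick: first measure the widest cell in each column, then pad every cell to that width.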

Doctest

I used table.py to finally wield the doctest module in anger. doctest is a standard library test module.

doctest is one of those things that is harder to explain than to just do.

Start by running the following command at the shell:

python table.py

This should print nothing, which means that all of the tests have passed.

Now run:

python table.py -v

If you compare the output with the code of table.py, you should see what is going on.

The idea is you put interactive examples into the docstrings of the module, which not only help to document the module, but also provide something that can be automatically checked.

So to write tests using doctest, you simply use the module at the interactive prompt, as we did in the first half of this post, and then copy the whole session into the docstrings.
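As a rough illustration of the pattern, here is a tiny stand-alone module with its own doctests. The function double is invented for this example, and the closing lines are only an assumption about how a module such as table.py might run its tests when executed directly.

```python
import doctest


def double(cell):
    """Return the cell text repeated twice.

    >>> double("EU")
    'EUEU'
    >>> double("")
    ''
    """
    return cell * 2


if __name__ == "__main__":
    # Prints nothing when every example passes. Running the file with -v
    # makes doctest report each example as it is checked.
    doctest.testmod()
```

Save it under any name and run it with and without -v to see the same behaviour described above.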

The one complication is that if the output of one of your commands contains a blank line, then you need to put <BLANKLINE> in place of that blank line in your docstring.
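The reason is that doctest treats a truly empty line as the end of the expected output. A small sketch, using a made-up function called two_paragraphs:

```python
import doctest


def two_paragraphs():
    """Print two lines separated by an empty one.

    >>> two_paragraphs()
    first
    <BLANKLINE>
    second
    """
    print("first\n\nsecond")


if __name__ == "__main__":
    doctest.testmod()
```

If you replace <BLANKLINE> with a real empty line, doctest stops reading the expected output after "first" and the example fails.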

1 David Jones says...

Some matters of style, some matters of taste:

Have you considered 'Name Year Members'.split() instead of ("Name", "Year", "Members")? It seems to split Python programmers into "love it"/"hate it". I like it and so does Raymond Hettinger.

The matrix version of that would be:

map(lambda x: x.split(), ['Name Year Members',
'ECSC 1958 6', 'EC 1973 9', 'EC 1981 10', 'EC 1986 12',
'EU 1995 15', 'EU 2004 25', 'EU 2007 27'])

Hmm, what a shame split isn't a builtin.

You often use an auxiliary variable to maintain an index count whilst looping through a list:

i = 0
for cell in row:
    if len(cell) > self.widths[i]:
        self.widths[i] = len(cell)
    i += 1

Use for i,cell in enumerate(row): instead.

And another thing... if len(cell) > self.widths[i]: self.widths[i] = len(cell) is better written as self.widths[i] = max(self.widths[i], len(cell))

So that's:

for i,cell in enumerate(row):
    self.widths[i] = max(self.widths[i], len(cell))

More in a bit.

Posted at 5:30 p.m. on October 7, 2008
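The enumerate and max suggestion in the comment above can be checked with a short stand-alone sketch. The plain lists used here in place of self.widths, and the sample rows, are assumptions made for illustration.

```python
rows = [("Name", "Year", "Members"), ("ECSC", "1958", "6")]

# Original style: a hand-maintained counter.
widths_counter = [0, 0, 0]
for row in rows:
    i = 0
    for cell in row:
        if len(cell) > widths_counter[i]:
            widths_counter[i] = len(cell)
        i += 1

# Suggested style: enumerate supplies the index, max replaces the if.
widths_enumerate = [0, 0, 0]
for row in rows:
    for i, cell in enumerate(row):
        widths_enumerate[i] = max(widths_enumerate[i], len(cell))

# Both loops arrive at the same column widths.
print(widths_counter == widths_enumerate)
```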


2 David Jones says...

div_tup = tuple([((width + 2) * line) for width in self.widths])

What version of Python do you use? Since we got generator expressions (2.4) almost everything that takes a list takes a generator expression (an iterator) instead. That means you don't need to create an intermediate list with «[ some comprehension ]». You can just drop the square brackets and go:

tuple(((width + 2) * line) for width in self.widths)

I would also recommend dropping the brackets around the expression to the left of "for"; the "for" keyword is a strong hint (to you and the parser) that it's a generator expression:

tuple((width + 2) * line for width in self.widths)

You ought to decide whether to write «'string' * number» or «number * 'string'»; you use both and switch between them. I wish PEP 8 said something about this. Personally I use the first one, with number on the right. I've no idea why, I just picked one and stuck with it.

Surrounding code:

div_sub = "+%s" * len(self.widths) + "+\n"
div_tup = tuple([((width + 2) * line) for width in self.widths])
return div_sub % div_tup

Ahhh.. using string formatting to join lots of strings together. The alarm bell is that you're constructing a format string dynamically. It's not always a mistake, but often is. In this case, use '+'.join(...). You can even get rid of the intermediate tuple:

return '+' + '+'.join((width + 2) * line for width in self.widths) + '+\n'

It's a shame you couldn't come to the code clinic at PyCon UK that Raymond Hettinger (and me, but Raymond did all the heavy lifting) did. It was lots of fun and I think you would've liked it.

Posted at 5:55 p.m. on October 7, 2008
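To see that the join version in the comment above produces the same divider as the format-string version, here is a small sketch. The widths list and the line character are stand-ins for self.widths and the line argument, chosen to match the EU table.

```python
widths = [4, 4, 7]
line = "-"

# Original approach: build a format string, then fill it in.
div_sub = "+%s" * len(widths) + "+\n"
div_tup = tuple((width + 2) * line for width in widths)
formatted = div_sub % div_tup

# Suggested approach: join the pieces directly.
joined = "+" + "+".join((width + 2) * line for width in widths) + "+\n"

# The two strings are identical.
print(formatted == joined)
```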


3 David Jones says...

Hmm. Those were long comments. Maybe I should've written a blog article instead.

Posted at 5:57 p.m. on October 7, 2008


4 Zeth says...

Hi David,

Thanks again for those really useful comments, I will study them carefully. On the code clinic, I heard a lot of people say it was fantastic. I was pretty involved in set up tasks that day, though I did manage to grab a little of the generator tutorial. How about you hold it again next year?

"What version of Python do you use?"

I have tended to target Python 2.3 and above. I probably end up using 2.4 or 2.5 features, but I usually try to make it work on 2.3 when required, because that is what the last version of Redhat and the previous version of OS X (10.4 Tiger) shipped with.

The dependencies I tend to use also tend to require 2.3 and above, so there is no point going below that and supporting Python 2.2.

Perhaps that is outdated and I should try out more modern features, but there are a lot of Redhat and Solaris servers out there with Python 2.3.

"use an auxiliary variable to maintain an index count whilst looping through a list"

On the i = 0, you caught me there. Being mostly self-taught and having started with the BBC Micro and Amstrad CPC, I suspect the following quote from the Dutch computer scientist Edsger Dijkstra is probably true:

"It is practically impossible to teach good programming style to students that have had prior exposure to BASIC; as potential programmers they are mentally mutilated beyond hope of regeneration."

I try not to use integer counts when unpacking, unless I do care how many items I have unpacked. I will try out enumerate, as I have never seen it before.

Posted at 7:18 p.m. on October 7, 2008


5 David Jones says...

I think the BBC Micro provides a fine start in life.

I sympathise with you on the Python 2.3 thing. That would normally be my "must run on" version, but for our recent www.clearclimatecode.org project we chose Python 2.4 (more or less by accident), and it has proved annoying when we tried to use AIX 6.1 for which only Python 2.3 binaries are easily available.

Posted at 10:29 a.m. on October 8, 2008


6 jmu says...

Great code snippet - I think I'd find some usage for that one. I wonder what is the license of your code? MIT?

Posted at 3:57 p.m. on November 5, 2008


7 Keegan Carruthers-Smith says...

@David Jones:

You can do

from string import split

instead of lambda ugliness

Posted at 11:03 a.m. on January 10, 2009
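A note for anyone trying this on Python 3: the split function is no longer in the string module there, but the same lambda-free idea works by passing the method str.split itself to map. A small sketch, with the list of rows shortened for illustration:

```python
rows = ['Name Year Members', 'ECSC 1958 6', 'EC 1973 9']

# Looked up on the class, str.split takes the string as its first
# argument, so it can be handed to map without a lambda.
matrix = list(map(str.split, rows))

# A list comprehension builds the same matrix.
same = [row.split() for row in rows]

print(matrix == same)
```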


8 Marcos di Silva says...

Hello, Zeth.

Congratulations on the script. It's very helpful. I'd like to report a bug. I'm trying to create tables with special characters, and the output is not OK. I modified the script to use unicode to calculate the correct cell length. Look at these lines, around line 194 in table.py:

for cell in row:
    # Calculate the required spacing for each cell
    row_space.append((self.widths[i] - len(unicode(cell, 'utf-8'))) * " ")
    # Add the cell data and the space to the row list
    row_list.append(cell)
    row_list.append(row_space[i])
    i += 1

Your original code is:

row_space.append((self.widths[i] - len(cell)) * " ")

Posted at 12:51 p.m. on June 24, 2010
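The underlying problem in the report above is that the length of an encoded byte string is not the number of characters it displays. The fix shown is Python 2 code; the sketch below illustrates the same difference on Python 3, where the two types are called bytes and str. The sample word is an assumption chosen for illustration.

```python
cell = "caf\u00e9"               # text: four characters, ending in e-acute
encoded = cell.encode("utf-8")   # bytes: the accented letter takes two bytes

# Padding computed from the byte count would be one space too short.
print(len(cell))
print(len(encoded))

# Decoding first, as the reported fix does, restores the character count.
print(len(encoded.decode("utf-8")))
```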



