Headfirst into the Semantic Web
12 May 2010
OWL stands for Web Ontology Language. The W3C gives the following overview:
The OWL Web Ontology Language is designed for use by applications that need to process the content of information instead of just presenting information to humans. OWL facilitates greater machine interpretability of Web content than that supported by XML, RDF, and RDF Schema (RDF-S) by providing additional vocabulary along with a formal semantics.
I am just getting into it and I do not really have a clue yet. An ontology is a representation of an information domain (i.e. some subset of the world). An ontology contains 'things', which are called 'individuals' in OWL terminology (roughly 'instances' in object-oriented terminology). Individuals may belong to a class (like a class in object-oriented terminology). Individuals also have relationships with each other, called properties. Of course, like all these semantic web things, OWL goes on and on with lots of dense, never-ending formal specifications.
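The individual/class/property vocabulary can be sketched in plain Python as a set of subject-predicate-object triples, which is the data model that RDF (and so OWL) builds on. This is only a toy illustration of the ideas, not OWL itself: the names Person, alice, bob and knows are invented, and no semantic web library is involved.

```python
# Each fact is a (subject, predicate, object) triple.
# All names here are made up for illustration.
triples = {
    ("Person", "type", "Class"),          # a class
    ("knows", "type", "ObjectProperty"),  # a property (a kind of relationship)
    ("alice", "type", "Person"),          # an individual belonging to the class
    ("bob", "type", "Person"),            # another individual
    ("alice", "knows", "bob"),            # a relationship between individuals
}


def individuals_of(cls, graph):
    """Return the sorted names of all individuals declared to be of class `cls`."""
    return sorted(s for s, p, o in graph if p == "type" and o == cls)


def related(subject, prop, graph):
    """Return the sorted objects linked to `subject` through property `prop`."""
    return sorted(o for s, p, o in graph if s == subject and p == prop)


print(individuals_of("Person", triples))
print(related("alice", "knows", triples))
```

Asking "which individuals are in this class?" or "who does alice know?" is then just a filter over the triples, which is roughly what a query language does over a real store.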
I have plenty of data that I could expose using OWL, and then I want to try to parse the OWL. In one sense OWL is just an RDF/XML format, so one could generate the correct XML using XQuery, SAX or XSLT. However, I thought I should find out whether there are any libraries that would provide a firm foundation for using OWL.
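To make the "just RDF/XML" point concrete, here is a minimal sketch that writes a tiny ontology by hand using only xml.etree.ElementTree from Python's standard library. The rdf and owl namespace URIs are the standard W3C ones; the example.org namespace and the names Person, alice, bob and knows are assumptions made up for illustration. It shows the hand-rolled route only and is not a substitute for a proper OWL library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/people#"  # hypothetical namespace for this sketch

# Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("owl", OWL)
ET.register_namespace("ex", EX)


def build_ontology():
    """Build an rdf:RDF element holding one class, one property and two individuals."""
    root = ET.Element(f"{{{RDF}}}RDF")
    # Declare a class and an object property.
    ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": EX + "Person"})
    ET.SubElement(root, f"{{{OWL}}}ObjectProperty", {f"{{{RDF}}}about": EX + "knows"})
    # Individuals are written as typed nodes: <ex:Person rdf:about="..."/>.
    ET.SubElement(root, f"{{{EX}}}Person", {f"{{{RDF}}}about": EX + "bob"})
    alice = ET.SubElement(root, f"{{{EX}}}Person", {f"{{{RDF}}}about": EX + "alice"})
    # A property linking one individual to another.
    ET.SubElement(alice, f"{{{EX}}}knows", {f"{{{RDF}}}resource": EX + "bob"})
    return root


xml_text = ET.tostring(build_ontology(), encoding="unicode")
print(xml_text)
```

This works for a handful of statements, but everything a library would check (valid URIs, consistent vocabulary, alternative serialisations) is left to the author, which is the motivation for looking at libraries.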
There is a nice editor called Protégé-OWL that allows you to play with ontologies in a GUI. Like most things in the XML world, a lot of the most prominent libraries are in Java, such as 'Jena' which among other things provides an OWL API. So the first major option seems to be writing a little wrapper for that.
RDFLib is a Python library for working with RDF. This seems to be the big beast of the Python RDF world, and a lot of the other Python tools are based on it. I am not sure, though, whether it supports OWL out of the box; I may have to do something on top to parse and serialize OWL. This seems to be the second major option.
However, I thought I would look into what else is around. I found the following libraries which so far I have taken no more than a passing look at.
- Django-RDF "is an RDF engine implemented in a generic, reusable Django app, providing complete RDF support to Django projects without requiring any modifications to existing framework or app source code, or incurring any performance penalty on existing control flow paths." Sounds great but doesn't seem to have ever been finished, and looks abandoned. However, probably worth looking into anyway.
- RDF Alchemy - As you may have guessed from the name, it is analogous to SQLAlchemy (a popular object relational mapper): "The goal of RDF Alchemy is to allow anyone who uses python to have a object type API access to an RDF Triplestore."
- FuXi - "A Python-based, bi-directional logical reasoning system for the semantic web"
- OWL Logic - claims to offer an API for OWL.
- OWL Sugar - is a new "library for manipulation of OWL documents using OO techniques in Python." Pretty early days apparently, but looks up my street.
- Sparta - Sparta is a simple, resource-centric API for RDF graphs, built on top of rdflib.
- TRAMP - Makes RDF look like Python data structures.
- Seth was a wrapper for a Java-based program called Pellet. The approach looks really smart but sadly the wrapper is no longer under active development. I thought about using it and updating it as necessary, but the source code does not seem to be complete: some of the .py files are missing. I could try to recover them from the .pyc files, I suppose.
- Cwm - a general-purpose data processor for the semantic web. It is written by Tim Berners-Lee of WWW fame. I haven't looked deeply into it but I am not entirely sure what it offers me that RDFLib doesn't.
- G-Protege Ontology Management System - extends Protege OWL, and uses something called 'the data base G'. The last part puts me off somewhat, as I have quite enough databases already.
- SuRF - "a Python library for working with RDF data in an Object-Oriented way."
I have to learn a bit more about OWL and what I am trying to achieve in order to say anything more specific about the merits of any of these libraries.
If you know anything about OWL then please let me know.



1 karl says...
I would not recommend starting with OWL, but maybe it suits you :) The hardcore way.
There is a good book, "Programming the Semantic Web: Build Flexible Applications with Graph Data" by Toby Segaran, Colin Evans and Jamie Taylor, with a lot of Python code in it.
You can also play with SKOS if you want to deal first with taxonomies http://www.w3.org/TR/2009/NOTE-skos-primer-20090818/ and understand some of the concepts of RDF.
There is the IRC #swig channel (Semantic Web Interest Group) on irc.freenode.net.
Posted at 11:52 a.m. on May 13, 2010
2 Krys says...
+1 for the book "Programming the Semantic Web". I'm reading it now and while I have noticed some bugs in the example code (at least via my Safari account), the book does a good job of taking you from normal database table data modelling up to semantic data modelling and then on to OWL ontologies, etc. It also does a good job of explaining what benefits you get from these things.
Hope this helps!
Posted at 1:35 p.m. on May 13, 2010
3 nchauvat says...
CubicWeb (http://www.cubicweb.org/) is a semantic web framework that knows about OWL and that I presented several years in a row at EuroPython.
Since you are developing the website for EuroPython, maybe you could learn about OWL by reading http://data.semanticweb.org/ and providing what EuroSciPy and fr.pycon.org provide.
The spec is at http://ontoware.org/swrc/
Posted at 8:41 p.m. on May 13, 2010
4 Graham Higgins says...
Hi Zeth,
As the previous posters have noted, by choosing OWL as your starting point, you've set yourself something of a challenge - still, if you manage to make decent headway it should put you in a good position to survey the scene, so to speak.
I idly followed up on a couple of your pointers that I found personally interesting, and you may be interested in my findings. G-Protege is likely to turn out to be a dead end ultimately: the plugin uses a Python pickle to mimic the G database, but the G database itself has apparently vaporised and the vendor's domain is parked. Exploring it might still be useful as an educational exercise, just so long as you can take the knowledge gained and transfer it elsewhere.
The Python examples for Seth do function but only if you travel back in time about three years (so to speak) so that you are using contemporaneous JPype and Python resources (Pellet 1.3.2beta, etc.)
owlsugar is definitely still in development (repos was only created a coupla months ago). It has been refactored since its tests were written and they haven't yet been brought into synch with the new code, so a little intelligent tweaking is required to get it to actually work.
OWL Logic is, I'm afraid, apparently long gone, no code survives unless it's hidden away in a SF archive somewhere. You /can/ get hold of the original 2004-vintage "RDFEngine" code but it's from a Master's thesis and needs some (a lot of) tidying before it can be used (with Python<=2.5).
HTH.
Posted at 2:30 p.m. on May 14, 2010
5 Graham Higgins says...
No, I tell a lie, OWL Logic is still there in the svn repos:
http://eulersharp.svn.sourceforge.net/viewvc/eulersharp/trunk/2004/02swap/?sortby=date
Posted at 2:58 p.m. on May 14, 2010
6 Peter Robinson says...
As Zeth's partner, sort-of, in this enterprise, I can maybe say a bit. I'm responsible for starting with OWL, because that seemed the most precise and complete statement of RDF syntax, etc. I think that's still the case. Also, there's a robust XML implementation of OWL, of course, and that suits us as we know XML well and have lots of tools for working with it. I wasn't familiar with SKOS, which seems quite a nifty way of dealing with simpler datasets, or with existing well-defined data in (say) a database that you want to push into the semantic web as efficiently as you can. I'm not very taken with Turtle, which SKOS seems to like, but I guess that's because I have got so used to XML. Actually, I think what we want to do is reasonably simple, essentially SPARQL retrieval on rather straightforward queries -- or so I hope.
Posted at 11:52 a.m. on May 20, 2010