Dalke Scientific Software: More science. Less time.

SWIG

There are often two parts to writing a Python extension which interfaces to an existing C library. The easier part is a thin layer mapping Python functions onto C functions. This layer is very mechanical, and mostly consists of converting between Python and C types. It's so mechanical and boring that various people have developed code generators which automate most of the process. The most widely used is David Beazley's SWIG. It understands most C and some C++ constructs and generates wrapper interfaces to access C/C++ functions, classes, constants, and so forth from Python, Perl, Tcl, C#, Java, Guile and other languages.

SWIG is not the only such project. Another is SIP, which is part of the PyQt project from Riverbank Computing. It understands C++ better than SWIG; it can handle polymorphic functions and namespaces and let Python objects subclass C++ classes, among other features. Another approach for C++ is Boost.Python, which uses C++'s metaprogramming to have the compiler itself handle most of the interface code. I know there are others but I'm off the net right now. More recent projects often use gccxml, which uses gcc to generate the list of objects which need bindings; SWIG and SIP use their own parsers, which don't understand all of C++. As I recall the GNU folks put up a big resistance to gccxml because it means non-free compilers could use the gcc tools as the front end without needing the GNU licensing model.

SWIG, while popular, seems to be on the wane, in part because it does what it needs to do well enough, with no big need for updates, and in part because other techniques, like ctypes, are taking over. My PyDaylight project uses SWIG and it was hard to get everything working correctly. Were I to start from scratch now I would use ctypes, despite the somewhat higher calling overhead.
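For comparison, here is a rough sketch of what the ctypes route looks like for the same three C functions that the rest of this page wraps with SWIG. It assumes Python 3 on a POSIX system where ctypes.CDLL(None) exposes the C library already loaded into the process. The helper name write_via_libc is made up for this illustration.

```python
import ctypes
import os

# Assumption: POSIX. CDLL(None) gives access to symbols already loaded
# into the running process, which includes the C library.
libc = ctypes.CDLL(None, use_errno=True)

# open() is variadic in C, so only its two fixed arguments are declared;
# the mode is passed as an explicit extra argument at call time.
libc.open.argtypes = [ctypes.c_char_p, ctypes.c_int]
libc.open.restype = ctypes.c_int
libc.write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
libc.write.restype = ctypes.c_ssize_t
libc.close.argtypes = [ctypes.c_int]
libc.close.restype = ctypes.c_int


def write_via_libc(path, data):
    """Create or truncate `path`, write the bytes `data`, return the count written."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = libc.open(os.fsencode(path), flags, ctypes.c_uint(0o644))
    if fd == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    try:
        n = libc.write(fd, data, len(data))
        if n == -1:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err), path)
        return n
    finally:
        # Always release the descriptor, even if write failed.
        libc.close(fd)
```

There is no interface file and no compile step, but every argument and return type has to be declared by hand, and each call goes through ctypes' generic call machinery rather than a compiled wrapper.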

I'm going to show how to use SWIG to write an interface to the open, write and close functions in the C library. These are the low-level POSIX interfaces on which the fopen, fwrite and fclose functions are built. With open you create a file handle (a "file descriptor"), which is an integer. The write function takes the handle and the text to write, and close closes the handle.

Here is the SWIG interface file, named "file_io.i":

/* specify the name of this module; or use the command-line option */
%module file_io

%{
/* SWIG inserts this section into the generated wrapper */

/* Get the prototypes for open, write, and close */
#include <fcntl.h>   /* open and the O_* flags */
#include <unistd.h>  /* write and close */

%}

/* SWIG needs this information to convert to/from Python ints. */
/* The underlying types are platform-specific; check your system headers. */
typedef unsigned short   mode_t;
typedef signed int    ssize_t;
typedef unsigned int    size_t;

/* This means "whenever a 'const void *' is followed by a 'size_t', */
/* interpret the pair as a string with the given length".           */
/*    (NOTE: size_t is unsigned while int is signed.  This will     */
/*     cause a problem with string length >= 2**31)                 */
%apply (char *STRING, int LENGTH) { (const void *, size_t)}


/* Here are the functions, copied verbatim from the include files */
int open(const char *path, int flags, mode_t mode);
ssize_t  write(int, const void *, size_t);
int      close(int);


/* Some constants for the open flags */
/* The left-hand side of the equation describes the symbol */
/* The right-hand side describes how to get it from the C code */

%constant int O_RDONLY = O_RDONLY;
%constant int O_WRONLY = O_WRONLY;
%constant int O_RDWR = O_RDWR;
%constant int O_APPEND = O_APPEND;
%constant int O_CREAT = O_CREAT;

One thing to note: while the above looks short, it took me quite some time to get it working. The hardest part was getting the %apply working, because I still don't fully understand typemap matching. Also tricky was getting the syntax right for everything. The ";" at the end of the %constant lines is important: leave it out and the line is silently ignored. It's also annoying having to figure out the actual C data types behind the different typedefs (size_t, mode_t, etc.). That's where the gccxml approach could help a lot.
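One way to check some of those typedefs without reading system headers is to ask ctypes, which knows the size and signedness of size_t and ssize_t for the platform Python was built on. This is only a sketch: the helper matching_c_types is a name invented here, it assumes Python 3, and ctypes has no mode_t, so that one still needs a look at the headers.

```python
import ctypes

# Plain C integer types that a typedef could stand for.
CANDIDATES = [
    ("short", ctypes.c_short),
    ("unsigned short", ctypes.c_ushort),
    ("int", ctypes.c_int),
    ("unsigned int", ctypes.c_uint),
    ("long", ctypes.c_long),
    ("unsigned long", ctypes.c_ulong),
    ("long long", ctypes.c_longlong),
    ("unsigned long long", ctypes.c_ulonglong),
]


def is_signed(ctype):
    # Storing -1 in an unsigned type wraps around to a large positive value.
    return ctype(-1).value < 0


def matching_c_types(ctype):
    """Return the names of the plain C types with the same size and signedness."""
    return [name for name, candidate in CANDIDATES
            if ctypes.sizeof(candidate) == ctypes.sizeof(ctype)
            and is_signed(candidate) == is_signed(ctype)]


if __name__ == "__main__":
    print("size_t :", matching_c_types(ctypes.c_size_t))
    print("ssize_t:", matching_c_types(ctypes.c_ssize_t))
```

On a 64-bit Linux system I would expect size_t to come out as an 8-byte unsigned type, in which case the "unsigned int" typedef in the interface file is too narrow for that platform and should be adjusted.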

To generate the interface code, run:

swig -python file_io.i

This generates the files "file_io_wrap.c" and "file_io.py". The first is compiled into a shared library named "_file_io.so". By convention the leading "_" means it should not be imported directly. Instead, import the file_io.py file. This will import the right symbols from _file_io.so and configure a few things needed for more complicated circumstances.

Compilation, as usual, is through setup.py. Note how I named the extension "_file_io".

from distutils.core import setup, Extension

setup(name="file_io", version="0.0",
      ext_modules=[Extension("_file_io", ["file_io_wrap.c"])])

Compile with "python setup.py build" and make a symbolic link so that "_file_io.so" is in the local directory.

Here's a test program:

import file_io

# open() returns the new file descriptor; don't assume it will be 3
fd = file_io.open("spam", file_io.O_CREAT|file_io.O_WRONLY, 0o666)
print("open %d" % fd)
file_io.write(fd, "Hello!\n")
file_io.close(fd)
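Python's own os module wraps the same three calls, which makes it a handy reference to compare the SWIG wrapper against. The sketch below (Python 3, with a made-up helper name write_hello) keeps the descriptor that os.open returns instead of assuming a particular number, since the value depends on what else the process already has open.

```python
import os


def write_hello(path):
    """Write one line to `path` using the os-level descriptor calls."""
    # os.open returns the descriptor as a plain int, just like C's open().
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o666)
    try:
        # os.write takes bytes and returns how many bytes were written.
        return os.write(fd, b"Hello!\n")
    finally:
        os.close(fd)


if __name__ == "__main__":
    print("wrote", write_hello("spam"), "bytes")
```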

I mentioned earlier there are often two parts when writing a C extension to Python. This was the easy part - making the functionality available to Python. The next and often more complicated step is making the interface "pythonic." That is, making it feel like native Python code. This can mean converting functions which do property lookups in C into attributes in Python, or doing automatic garbage collection, or supporting Python's iteration protocol.

Here is an example of what I mean. The following is a beginning attempt at making the file_io interface more Pythonic by making the result act more like a writeable file object.

import file_io

# emulate Python's "open" command
def my_open(filename, mode):
  flags = 0
  for c in mode:
    if c == "r": flags |= file_io.O_RDONLY
    # Python's "w" also truncates; that needs O_TRUNC, which file_io doesn't export
    elif c == "w": flags |= (file_io.O_WRONLY|file_io.O_CREAT)
    # "a" is Python's append mode ("+" means read/write update, not append)
    elif c == "a": flags |= (file_io.O_WRONLY|file_io.O_CREAT|file_io.O_APPEND)
    else: raise ValueError("unknown mode character %r in %r" % (c, mode))

  h = file_io.open(filename, flags, 0o666)
  if h == -1:
    # should get the error message from errno
    raise IOError("cannot my_open %r" % (filename,))

  return MyFile(h)

# emulate part of a file object
class MyFile(object):
  def __init__(self, h):
    self.h = h
  def write(self, s):
    file_io.write(self.h, s)
  def close(self):
    if self.h != -1:
      file_io.close(self.h)
      self.h = -1
  def __del__(self):
    if self.h != -1:
      self.close()

def test():
  f = my_open("spam", "w")
  f.write("Are you there?\n")

if __name__ == "__main__":
  test()
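A further step in the same direction is supporting the with statement, so that the handle is closed at a predictable point rather than whenever __del__ happens to run. The sketch below is a variant built on several assumptions, not a drop-in replacement: it targets Python 3, uses the os module as a stand-in for file_io so it runs without building the extension, and the names MyFile2 and open2 are invented here.

```python
import os


class MyFile2(object):
    """Writable file-like object around an integer file descriptor."""

    def __init__(self, h):
        self.h = h

    def write(self, s):
        data = s.encode("utf-8")
        # A single write call may write fewer bytes than asked, so loop.
        while data:
            n = os.write(self.h, data)
            data = data[n:]

    def close(self):
        if self.h != -1:
            os.close(self.h)
            self.h = -1

    # Context manager protocol: "with open2(name) as f: ..."
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions

    def __del__(self):
        self.close()


def open2(filename):
    # Stand-in for my_open(filename, "w"); os.open raises OSError on failure.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    return MyFile2(os.open(filename, flags, 0o666))
```

With this in place, "with open2("spam") as f: f.write("Are you there?\n")" closes the descriptor as soon as the block ends, even if the body raises.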

In Python and many of the other high-level dynamic languages people talk about "duck typing". That is, "if it looks like a duck and acts like a duck then don't worry about checking if it derives from the duck type." I can pass a MyFile instance to any function which expects an object with a write method that takes a string. I didn't have to have MyFile derive from some base file class.
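To make that concrete, here is a small sketch in Python 3. The function save_report and the class Collector are invented for the example; the only thing save_report relies on is that its argument has a write method that accepts a string.

```python
import io


def save_report(out, lines):
    """Write each line to `out`; works with anything that has write(str)."""
    for line in lines:
        out.write(line + "\n")


class Collector(object):
    """Not derived from any file class; it just remembers what was written."""

    def __init__(self):
        self.chunks = []

    def write(self, s):
        self.chunks.append(s)


if __name__ == "__main__":
    buf = io.StringIO()
    save_report(buf, ["spam", "eggs"])

    col = Collector()
    save_report(col, ["spam", "eggs"])

    # Both objects received the same text, despite sharing no base class.
    print(buf.getvalue() == "".join(col.chunks))
```

A MyFile instance could be passed as the out argument in exactly the same way.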



Copyright © 2001-2013 Andrew Dalke Scientific AB