ctypes

Python 2.5 includes the ctypes module, a "foreign function interface" (FFI) library that was once available only as a third-party package. An FFI lets Python code call C functions using only Python. Most other approaches, such as writing a Python extension, require writing some C code.

The ctypes module works on shared libraries. On Unix systems the C library provides a set of functions (dlopen, dlsym, dlerror, dlclose) for loading a shared library and resolving the symbols in it, which are usually named functions and variables. C's datatypes are not directly compatible with Python's, so a Python object cannot simply be passed to a C function. Instead, ctypes uses type hints to convert from one to the other. It also supports C-style memory allocation and mapping to C structures, which can be useful if you need a very compact memory representation.

While the dl* calls are reasonably standardized, the names and locations of shared libraries are not. I have a Mac, which uses the ".dylib" extension for shared libraries used by the run-time loader and ".so" for other shared libraries. Most other Unix systems use ".so" for both and don't distinguish between the two types. MS Windows uses ".dll" and that's about all I know. I'll start with a simple example of using code from libc, which is the standard C library.

import ctypes

libc = ctypes.CDLL("libc.dylib", ctypes.RTLD_GLOBAL)
print libc.strlen("Hello!")

For Linux try

libc = ctypes.CDLL("libc.so")

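I don't think Linux needs the RTLD_GLOBAL, which the Mac needs for reasons I don't fully understand. If "libc.so" fails to load on your Linux system, try "libc.so.6" instead; on many distributions "libc.so" is a linker script rather than the library itself.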

This will print 6 to the screen, which is the number of characters in the string. Python internally adds a terminating NUL character to every string, so it reserves 7 bytes for the characters "H", "e", "l", "l", "o", "!" and "\0". That's why strlen worked as expected.
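
Since the library file name differs between platforms, here is a minimal sketch of one way to avoid hard-coding it. It assumes a Unix-like system and uses ctypes.util.find_library to look up the C library. The helper name load_libc is my own, not part of ctypes. The string is written as a bytes literal so the same code also runs on Python 3, where C strings must be passed as bytes.

```python
import ctypes
import ctypes.util


def load_libc():
    """Hypothetical helper: load the C library without a hard-coded file name."""
    # find_library returns a name such as "libc.so.6" on many Linux systems,
    # or None when it cannot locate the library.
    name = ctypes.util.find_library("c")
    # On Unix, CDLL(None) exposes the symbols already loaded into the
    # process, which includes the C library.
    return ctypes.CDLL(name)


libc = load_libc()
hello_length = libc.strlen(b"Hello!")
print(hello_length)
```

On Windows the lookup works differently, so treat this as a starting point for Unix-like systems only.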

Here's a slightly more complicated example

num_bytes = libc.printf("the number %d and string %s\n", 10, "spam")
print num_bytes, "bytes in the output"

with the output

the number 10 and string spam
30 bytes in the output

Result and argument types

The ctypes interface by default only handles Python integer, long, and string data types. If you use float or something else then you must tell ctypes how to convert the C function call arguments and result value.

I'll start with the result type. Ctypes assumes that the result is a C int, so it interprets the returned bits accordingly.

>>> print "'2.5' as a double is", libc.atof("2.5")
'2.5' as a double is 1
>>>

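Frankly I can't figure out how it gets the "1". The double representation for 2.5 is '@\x04\x00\x00\x00\x00\x00\x00' so the integer value should be 1074003968 or 0, depending on the stack order. Perhaps there's an overflow because there are 8 bytes on the stack instead of 4. Another likely explanation is that many platforms return a double in a floating-point register while ctypes reads the integer return register, in which case the value it sees has nothing to do with the double.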

To fix the problem, set the "restype" attribute on the function. This annotates the function (Python functions can have attributes just like classes can) so ctypes knows how to convert the result into a Python object.

>>> libc.atof.restype = ctypes.c_double
>>> print "'2.5' as a float is", libc.atof("2.5")
'2.5' as a float is 2.5
>>>
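
As a small sketch of why it can be worth declaring the argument types along with the result type: once "argtypes" is set, ctypes rejects a call with the wrong kind of argument instead of passing garbage to C. This assumes a Unix-like system where ctypes.util.find_library can locate the C library, and it uses a bytes literal so it also runs on Python 3.

```python
import ctypes
import ctypes.util

# Assumption: find_library can locate the C library on this system.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

atof = libc.atof
atof.argtypes = [ctypes.c_char_p]   # const char *
atof.restype = ctypes.c_double      # double

value = atof(b"2.5")
print(value)

# With argtypes declared, a float argument is refused before C is called.
try:
    atof(2.5)
    rejected = False
except ctypes.ArgumentError:
    rejected = True
print(rejected)
```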

How about a function which takes floats as input? My favorite is atan2 from the math library. It takes two doubles and returns a double. The special function attribute "argtypes" takes a list describing the parameter types, like this:

>>> libc.atan2.argtypes = [ctypes.c_double, ctypes.c_double]
>>> libc.atan2.restype = ctypes.c_double
>>> libc.atan2(3.0, 4.0)
0.64350110879328437
>>> import math
>>> math.atan2(3.0, 4.0)
0.64350110879328437
>>>
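
On my Mac atan2 is reachable through libc, but on other systems the math functions may live in a separate library. Here is a hedged sketch that looks for the math library first and falls back to the C library. It assumes a Unix-like system where ctypes.util.find_library finds at least one of them.

```python
import ctypes
import ctypes.util
import math

# Assumption: one of these lookups succeeds on the current system.
libm_name = ctypes.util.find_library("m") or ctypes.util.find_library("c")
libm = ctypes.CDLL(libm_name)

libm.atan2.argtypes = [ctypes.c_double, ctypes.c_double]
libm.atan2.restype = ctypes.c_double

angle = libm.atan2(3.0, 4.0)
# Because argtypes is set, Python integers are converted to C doubles too.
angle_from_ints = libm.atan2(3, 4)

print(angle)
print(math.atan2(3.0, 4.0))
```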

See the ctypes documentation for more details on the available data types, including how to call with pointers.
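
To give a taste of calling with pointers, here is a minimal sketch using frexp, which returns the mantissa and writes the exponent through an "int *" output parameter. The choice of frexp is mine, picked because it is a small standard function. As before, it assumes a Unix-like system where ctypes.util.find_library can find the math or C library.

```python
import ctypes
import ctypes.util

# Assumption: one of these lookups succeeds on the current system.
libm_name = ctypes.util.find_library("m") or ctypes.util.find_library("c")
libm = ctypes.CDLL(libm_name)

# double frexp(double value, int *exponent);
libm.frexp.argtypes = [ctypes.c_double, ctypes.POINTER(ctypes.c_int)]
libm.frexp.restype = ctypes.c_double

exponent = ctypes.c_int()                       # storage that C writes into
mantissa = libm.frexp(8.0, ctypes.byref(exponent))

# 8.0 == 0.5 * 2**4
print(mantissa)
print(exponent.value)
```

ctypes.byref passes the address of the c_int without creating a full pointer object, which is enough when the pointer is only needed for the duration of the call.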

Complex C structures

Ctypes lets you define your own data structures using the same physical layout as C does. You can use Python to build the data structure and pass the result into a C function. Even if you don't want to call a C function you might do this to save memory as Python data structures take somewhat more memory than C would. Ctypes tracks the created objects for memory management so there is still some overhead.
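
Here is a small sketch of the memory-saving idea, with no C function involved. The Point structure is a made-up example. ctypes.sizeof reports the same size the C compiler would use for the equivalent struct, and an array of structures is stored as one contiguous block.

```python
import ctypes


class Point(ctypes.Structure):
    """Hypothetical example: same layout as 'struct { double x, y; }' in C."""
    _fields_ = [("x", ctypes.c_double),
                ("y", ctypes.c_double)]


# "Point * 1000" creates a new array type; calling it allocates the
# zero-initialized storage for all 1000 structures at once.
PointArray = Point * 1000
points = PointArray()
points[0].x = 1.5

point_size = ctypes.sizeof(Point)
array_size = ctypes.sizeof(points)
print(point_size)
print(array_size)
```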

I'm going to call 'getpwnam' to get a password file entry by user name. The C prototype for that is

struct passwd *
getpwnam(const char *login);

where the passwd struct is the rather complicated

struct passwd {
        char    *pw_name;       /* user name */
        char    *pw_passwd;     /* encrypted password */
        uid_t   pw_uid;         /* user uid */
        gid_t   pw_gid;         /* user gid */
        time_t  pw_change;      /* password change time */
        char    *pw_class;      /* user access class */
        char    *pw_gecos;      /* Honeywell login info */
        char    *pw_dir;        /* home directory */
        char    *pw_shell;      /* default shell */
        time_t  pw_expire;      /* account expiration */
        int     pw_fields;      /* internal: fields filled in */
};

I'll define this structure using ctypes

class PASSWD(ctypes.Structure):
    _fields_ = [("name", ctypes.c_char_p),
                ("passwd", ctypes.c_char_p),
                ("uid", ctypes.c_int),
                ("gid", ctypes.c_int),
                ("change", ctypes.c_long),
                # "class" is a Python keyword; read it with getattr(entry, "class")
                ("class", ctypes.c_char_p),
                ("gecos", ctypes.c_char_p),
                ("dir", ctypes.c_char_p),
                ("shell", ctypes.c_char_p),
                ("expire", ctypes.c_long),
                ("fields", ctypes.c_int)   ]

The getpwnam function takes a string so I don't need to declare the argument types, but I'll do that anyway to ensure it's only called with a single string.

>>> libc.getpwnam.argtypes = [ctypes.c_char_p]
>>>

If I call it now, without setting the restype attribute, the result will be an integer

>>> libc.getpwnam("dalke")
6340464
>>>

I'll tell ctypes that the result of calling getpwnam is a pointer to the PASSWD data structure.

>>> libc.getpwnam.restype = ctypes.POINTER(PASSWD)
>>> libc.getpwnam("dalke")

The result is an "LP_PASSWD" object, which means "long pointer to PASSWD". Because of the way C works, ctypes doesn't know if this is a single value, passed by pointer, or a pointer to an array of objects. It always assumes the latter, so to get the actual password structure I'll get the first entry using [0] then get the attributes from that entry

>>> entry = libc.getpwnam("dalke")[0]
>>> entry.name
'dalke'
>>> entry.uid, entry.gid
(504, 20)
>>> entry.shell
'/bin/tcsh'
>>>
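
getpwnam returns a NULL pointer when the user does not exist, so it is worth checking the pointer before indexing it. Here is a minimal sketch of pointer handling that does not depend on any C library. The Pair structure is a made-up stand-in for PASSWD. It shows that [0] and the "contents" attribute reach the same structure, and that a NULL pointer is false in a boolean test.

```python
import ctypes


class Pair(ctypes.Structure):
    """Hypothetical stand-in for a C struct returned by pointer."""
    _fields_ = [("first", ctypes.c_int),
                ("second", ctypes.c_int)]


pair = Pair(1, 2)
pair_ptr = ctypes.pointer(pair)

# Two equivalent ways to follow the pointer.
via_index = pair_ptr[0].second
via_contents = pair_ptr.contents.second

# Calling a pointer type with no arguments creates a NULL pointer.
null_ptr = ctypes.POINTER(Pair)()
is_null = not null_ptr

# Dereferencing NULL raises a Python exception instead of crashing.
try:
    null_ptr[0]
    null_error = None
except ValueError as err:
    null_error = str(err)

print(via_index)
print(is_null)
```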

Calling Python from C

The above examples called C from Python. Sometimes you want C to call Python. A common example is a numerical integrator. It takes a function as one of its parameters and evaluates it at different points in the integration range.

Here's a Python numerical integrator using the rectangle rule. It evaluates the probe function at 4 points in the range -1 to 1.

def integrate_rectangle(f):
  return (f(-0.75)+f(-0.25)+f(0.25)+f(0.75))/2.0

import math
def probe_function(x):
  return math.cos(x)*(x+1)

print repr(integrate_rectangle(probe_function))

The above prints "1.7006012905844656". The probe function has the analytic antiderivative (x+1)*sin(x)+cos(x). Evaluating that between -1 and 1 gives 1.682941969615793, so the numerical integration is off by about 1 percent.
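
The comparison with the analytic result can be checked with a few lines of Python. This sketch repeats the two functions from above so that it runs on its own; the names antiderivative, exact, approx and relative_error are mine.

```python
import math


def probe_function(x):
    return math.cos(x) * (x + 1)


def antiderivative(x):
    # d/dx [(x+1)*sin(x) + cos(x)] == (x+1)*cos(x)
    return (x + 1) * math.sin(x) + math.cos(x)


def integrate_rectangle(f):
    return (f(-0.75) + f(-0.25) + f(0.25) + f(0.75)) / 2.0


exact = antiderivative(1.0) - antiderivative(-1.0)
approx = integrate_rectangle(probe_function)
relative_error = abs(approx - exact) / exact

print(exact)
print(relative_error)
```

The x*cos(x) part of the integrand is odd and cancels over the symmetric range, which is why the exact value is simply 2*sin(1).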

To get the result I passed the "probe_function" to the "integrate_rectangle" function, which in turn evaluated the probe function at different points.

By the way, Gaussian quadrature is a better numerical integrator than the rectangle method, where "better" means "less error for the same number of evaluations". Gaussian quadrature has its own problems. See elsewhere for the details. Here's the 4 point GQ integrator for the range -1 to 1.

def integrate_gq(f):
    return (0.652145155*(f(+0.339981044) + f(-0.339981044)) +
            0.347854845*(f(+0.861136312) + f(-0.861136312)))

Using it on the probe function yields an error of about 3E-7

>>> integrate_gq(probe_function)
1.6829416883812192
>>>
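
One way to see why Gaussian quadrature does so well: an n point Gauss-Legendre rule integrates polynomials up to degree 2n-1 exactly, so the 4 point rule is exact up to degree 7. The following sketch checks that with x**6 and x**8. The small remaining error for x**6 comes from the nodes and weights being rounded to 9 digits.

```python
def integrate_gq(f):
    return (0.652145155 * (f(+0.339981044) + f(-0.339981044)) +
            0.347854845 * (f(+0.861136312) + f(-0.861136312)))


# The integral of x**k over [-1, 1] is 2/(k+1) for even k.
error_x6 = abs(integrate_gq(lambda x: x ** 6) - 2.0 / 7.0)
error_x8 = abs(integrate_gq(lambda x: x ** 8) - 2.0 / 9.0)

print(error_x6)   # tiny: degree 6 is within the exact range
print(error_x8)   # noticeably larger: degree 8 is beyond it
```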

For demonstration purposes I'll rewrite the above integrators in C. In reality you likely shouldn't do that as there are well-developed general-purpose libraries for numerical integration based on decades of experience and with tunable parameters like "maximum error" which the above two don't have.

I'll start with the header file titled "integrators.h"

#ifndef DALKE_INTEGRATORS_H
#define DALKE_INTEGRATORS_H

/* Two simple numerical integrators */

typedef double (*probe_func_t)(double);

double integrate_gq(probe_func_t f);
double integrate_rectangle(probe_func_t f);

#endif

The tricky part is the "probe_func_t". The integrators need a probe function which takes a double and returns a double. The typedef declares the function pointer type for that case. The DALKE_INTEGRATORS_H is the usual guard in a C header file to make sure the body isn't compiled twice.

The integrators are in the file "integrators.c"

/* Two numerical integrators */
#include "integrators.h"

/* 4 point Gaussian quadrature integration */
double integrate_gq(probe_func_t f) {
    return
        0.652145155*(f(+0.339981044) + f(-0.339981044)) +
        0.347854845*(f(+0.861136312) + f(-0.861136312));
}

/* 4 point rectangle rule */
double integrate_rectangle(probe_func_t f) {
    return (f(-0.75)+f(-0.25)+f(0.25)+f(0.75))/2;
}

Different OSes have different ways to create shared libraries. Luckily, Python's distutils handles those cases for us through a setup.py file. I can use it to generate a shared library even if the resulting file is not a Python extension library. Here's my setup.py:

from distutils.core import setup, Extension

setup(name="integrators", version="0.0",
      ext_modules=[Extension("integrators", ["integrators.c"])])

and here's me using it

% python setup.py build
running build
running build_ext
building 'integrators' extension
creating build
creating build/temp.macosx-10.3-ppc-2.6
gcc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I/Users/dalke/cvses/python-svn/Include -I/Users/dalke/cvses/python-svn -c integrators.c -o build/temp.macosx-10.3-ppc-2.6/integrators.o
creating build/lib.macosx-10.3-ppc-2.6
gcc -bundle -undefined dynamic_lookup build/temp.macosx-10.3-ppc-2.6/integrators.o -o build/lib.macosx-10.3-ppc-2.6/integrators.so

The resulting shared library was placed in build/lib.macosx-10.3-ppc-2.6. To simplify testing I made a symbolic link from there to my working directory

ln -s build/lib.macosx-10.3-ppc-2.6/integrators.so .

I'll see if I can load the library and get the integrate_rectangle function

>>> import ctypes
>>> ctypes.CDLL("integrators.so")
<CDLL 'integrators.so', handle 620cd0 at 540d10>
>>> integrators = _
>>> integrators.integrate_rectangle
<_FuncPtr object at 0x39cce0>
>>>

Success! Or the first step on the path to success. Watch as I make a simple probe function and fail to pass it in to the integrator

>>> def flat_probe(x):
...   return 1.0
... 
>>> integrators.integrate_rectangle(flat_probe)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ctypes.ArgumentError: argument 1: <type 'exceptions.TypeError'>: Don't know how to convert parameter 1
>>>

I need to tell ctypes how to call my probe function. I'll create a new data type telling ctypes the required input and output types. This is the equivalent of the probe_func_t from earlier

>>> PROBE_FUNC = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)
>>>

The first argument is the return type and the remaining arguments are the argument types. See the ctypes documentation section "callback functions" for an example that doesn't use the same type in both places.
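
For a callback whose return type differs from its argument types, here is a sketch along the lines of the qsort example in the ctypes documentation. The comparison callback receives two pointers to int and returns an int. It assumes a Unix-like system where ctypes.util.find_library can locate the C library; the names CMPFUNC and compare_ints are only for illustration.

```python
import ctypes
import ctypes.util

# Assumption: find_library can locate the C library on this system.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# int (*compare)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))


def compare_ints(a, b):
    # a and b are pointers; [0] fetches the int each one points to.
    return (a[0] > b[0]) - (a[0] < b[0])


compare_callback = CMPFUNC(compare_ints)

libc.qsort.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None   # qsort returns void

values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), compare_callback)

sorted_values = list(values)
print(sorted_values)
```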

I'll annotate the function with descriptions of the argument and result types then pass in the flat_probe function. Even though the argtype says the function is a PROBE_FUNC it looks like I have to convert the Python function to a CFunctionType object using PROBE_FUNC(flat_probe).

>>> integrators.integrate_rectangle.argtypes = [PROBE_FUNC]
>>> integrators.integrate_rectangle.restype = ctypes.c_double
>>> integrators.integrate_rectangle(PROBE_FUNC(flat_probe))
2.0
>>>
>>> integrators.integrate_gq.argtypes=[PROBE_FUNC]
>>> integrators.integrate_gq.restype = ctypes.c_double
>>> integrators.integrate_gq(PROBE_FUNC(flat_probe))
2.0
2.0
>>>
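
One detail worth knowing: the ctypes documentation warns that you must keep a reference to a CFUNCTYPE object for as long as C code may call it. Wrapping the function inline, as above, is fine because the integrator finishes before the temporary object goes away. For a callback that C stores and calls later, a sketch like the following keeps the wrapper alive by binding it to a name, here by using the prototype as a decorator. The wrapped object can also be called directly from Python, which makes for an easy check without the C library.

```python
import ctypes
import math

PROBE_FUNC = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)


# Using the prototype as a decorator replaces the Python function with the
# ctypes callback object, and the module-level name keeps it alive.
@PROBE_FUNC
def probe_function(x):
    return math.cos(x) * (x + 1)


# Calling the callback object goes through the C calling convention and
# back into Python, converting the argument and result as C doubles.
direct_result = probe_function(0.0)
print(direct_result)
```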

I'll numerically integrate the more complicated example from earlier

>>> import math
>>> def probe_function(x):
...   return math.cos(x)*(x+1)
... 
>>> integrators.integrate_gq(PROBE_FUNC(probe_function))
1.6829416883812192
>>> integrators.integrate_rectangle(PROBE_FUNC(probe_function))
1.7006012905844656
>>>

The earlier Python code gave identical results.