Dalke Scientific Software: More science. Less time. Products
[ previous | newer ]     /home/writings/diary/archive/2005/04/20/tracing_python_code

Tracing python code

Sometimes it's helpful to watch what Python the low-level details of what Python is doing. With an IDE like Wing IDE or Komodo you can step through events in a GUI. But if you're a command-line sort of person like me you can roll your own trace program using the same run-time hooks used by those IDEs

The hook is set via the sys.settrace() function. This takes a function which is called during any of several events. One is the "line" event. Before the interpreter executes a line it calls the trace function with the current frame and the string "line". (The function gets a third argument which is None; that fields is used by other events.

The frame contains the attribute "f_lineno" which is the line number about to be run. Here is a simple example which only prints the current line number. Note that the trace function returns itself. If it doesn't then Python's trace mechanism interprets that as a request to stop the tracing while inside the given scope.

import sys

def traceit(frame, event, arg):
    if event == "line":
        lineno = frame.f_lineno
        print "line", lineno

    return traceit

def main():
    print "In main"
    for i in range(5):
        print i, i*3
    print "Done."
    

sys.settrace(traceit)
main()
Here is the output
line 11
In main
line 12
line 13
0 0
line 12
line 13
1 3
line 12
line 13
2 6
line 12
line 13
3 9
line 12
line 13
4 12
line 12
line 14
Done.

Stare at the output long enough and you can figure out what's going on but it isn't easy. It's more helpful to also print the line. The linecache module is available for this but it needs to know the filename. In this case I know all of the lines are from "traceit.py" so I'll hard-code that into the program.

import sys
import linecache

def traceit(frame, event, arg):
    if event == "line":
        lineno = frame.f_lineno
        line = linecache.getline("traceit.py", lineno)
        print "line %d: %s" % (lineno, line.rstrip())
    return traceit
Here's the output
line 13:     print "In main"
In main
line 14:     for i in range(5):
line 15:         print i, i*3
0 0
line 14:     for i in range(5):
line 15:         print i, i*3
1 3
line 14:     for i in range(5):
line 15:         print i, i*3
2 6
line 14:     for i in range(5):
line 15:         print i, i*3
3 9
line 14:     for i in range(5):
line 15:         print i, i*3
4 12
line 14:     for i in range(5):
line 16:     print "Done."
Done.
Much more readable. But what if the program uses another file so that the hard-coded traceit.py doesn't work? Instead of saying print i, i*3 I'll use print i, random.randrange(0, 10) to get a random integer in the range 0<=n<10. I chose this because it is implemented in Python. I originally chose random.random() but that's implemented in C so it wasn't interesting. Here's the output:
line 14:     print "In main"
In main
line 15:     for i in range(5):
line 16:         print i, random.randrange(0, 10)
0 line 141: 
line 142: 
line 144: 
line 150: 
line 151: 
line 153: 
line 166: 
8
line 15:     for i in range(5):
line 16:         print i, random.randrange(0, 10)
1 line 141: 
line 142: 
line 144: 
line 150: 
line 151: 
line 153: 
line 166: 
8
line 15:     for i in range(5):
line 16:         print i, random.randrange(0, 10)
2 line 141: 
line 142: 
line 144: 
line 150: 
line 151: 
line 153: 
line 166: 
7
line 15:     for i in range(5):
line 16:         print i, random.randrange(0, 10)
3 line 141: 
line 142: 
line 144: 
line 150: 
line 151: 
line 153: 
line 166: 
8
line 15:     for i in range(5):
line 16:         print i, random.randrange(0, 10)
4 line 141: 
line 142: 
line 144: 
line 150: 
line 151: 
line 153: 
line 166: 
5
line 15:     for i in range(5):
line 17:     print "Done."
Done.
As you can see it isn't very helpful because it gets lines from the wrong file.

The current filename is available from the module's globals dictionary using the special key of __file__. Here's a trace function that only displays the filename and line number.

import sys
import linecache
import random

def traceit(frame, event, arg):
    if event == "line":
        lineno = frame.f_lineno
        filename = frame.f_globals["__file__"]
        print "file %s line %d" % (filename, lineno)
    return traceit


def main():
    print "In main"
    for i in range(5):
        print i, random.randrange(0, 10)
    print "Done."

sys.settrace(traceit)
main()
and the output is
file <stdin> line 14
In main
file <stdin> line 15
file <stdin> line 16
0 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166
2
file <stdin> line 15
file <stdin> line 16
1 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166
3
file <stdin> line 15
file <stdin> line 16
2 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166
0
file <stdin> line 15
file <stdin> line 16
3 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166
2
file <stdin> line 15
file <stdin> line 16
4 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153
file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166
3
file <stdin> line 15
file <stdin> line 17
Done.
The "<stdin>" is because I ran the code from Emacs in python-mode. If I run it from the command-line I get the correct filename.
file traceit.py line 14
In main
file traceit.py line 15
file traceit.py line 16
This means we'll need to treat the main program somewhat carefully. What I'll do is force the correct filename if it's found to be "<stdin>".

If you look at the filenames you'll see the ones from the random module end with ".pyc". This means the code actually came from the precompiled bytecode and not the original Python file. Getting the right file means dropping the final "c"; or "o" for the ".pyo" optimized bytecode files. For this example I don't want to see the full filename so I'll use the module's __name__ instead.

The current version of the code is

import sys
import linecache
import random

def traceit(frame, event, arg):
    if event == "line":
        lineno = frame.f_lineno
        filename = frame.f_globals["__file__"]
        if filename == "<stdin>":
            filename = "traceit.py"
        if (filename.endswith(".pyc") or
            filename.endswith(".pyo")):
            filename = filename[:-1]
        name = frame.f_globals["__name__"]
        line = linecache.getline(filename, lineno)
        print "%s:%s: %s" % (name, lineno, line.rstrip())
    return traceit


def main():
    print "In main"
    for i in range(5):
        print i, random.randrange(0, 10)
    print "Done."

sys.settrace(traceit)
main()
and the output is
__main__:21:     print "In main"
In main
__main__:22:     for i in range(5):
__main__:23:         print i, random.randrange(0, 10)
0 random:141:         istart = int(start)
random:142:         if istart != start:
random:144:         if stop is default:
random:150:         istop = int(stop)
random:151:         if istop != stop:
random:153:         if step == 1 and istart < istop:
random:166:             return int(istart + int(self.random()*(istop - istart)))
3
__main__:22:     for i in range(5):
__main__:23:         print i, random.randrange(0, 10)
1 random:141:         istart = int(start)
random:142:         if istart != start:
random:144:         if stop is default:
random:150:         istop = int(stop)
random:151:         if istop != stop:
random:153:         if step == 1 and istart < istop:
random:166:             return int(istart + int(self.random()*(istop - istart)))
3
__main__:22:     for i in range(5):
__main__:23:         print i, random.randrange(0, 10)
2 random:141:         istart = int(start)
random:142:         if istart != start:
random:144:         if stop is default:
random:150:         istop = int(stop)
random:151:         if istop != stop:
random:153:         if step == 1 and istart < istop:
random:166:             return int(istart + int(self.random()*(istop - istart)))
7
__main__:22:     for i in range(5):
__main__:23:         print i, random.randrange(0, 10)
3 random:141:         istart = int(start)
random:142:         if istart != start:
random:144:         if stop is default:
random:150:         istop = int(stop)
random:151:         if istop != stop:
random:153:         if step == 1 and istart < istop:
random:166:             return int(istart + int(self.random()*(istop - istart)))
7
__main__:22:     for i in range(5):
__main__:23:         print i, random.randrange(0, 10)
4 random:141:         istart = int(start)
random:142:         if istart != start:
random:144:         if stop is default:
random:150:         istop = int(stop)
random:151:         if istop != stop:
random:153:         if step == 1 and istart < istop:
random:166:             return int(istart + int(self.random()*(istop - istart)))
7
__main__:22:     for i in range(5):
__main__:24:     print "Done."
Done.

How is this helpful for debugging? Let me modify the code just a bit

import sys
import linecache

def traceit(frame, event, arg):
    if event == "line":
        lineno = frame.f_lineno
        filename = frame.f_globals["__file__"]
        if (filename.endswith(".pyc") or
            filename.endswith(".pyo")):
            filename = filename[:-1]
        name = frame.f_globals["__name__"]
        line = linecache.getline(filename, lineno)
        print "%s:%s: %s" % (name, lineno, line.rstrip())
    return traceit

Terry Brunck over at Daylight found a problem with my PyDaylight library. At the end of a function after it was finished and Python was doing garbage collection the code seemed to do nothing. Rather it was doing something because the CPU was pegged but we couldn't figure out what that was.

I wrote a trace function like the one above and put the following at the end of the function that had the problem so it would be called just before it finished:

        import sys
        sys.settrace(traceit)
This let me watch what the finalization code did. The problem was tracked down to a dt_dealloc() call meaning the Daylight toolkit was making the big demands on the CPU. That clue helped narrowed down the problem space. It ended up being three different things. Terry's code made heavy use of a PyDaylight function that made Daylight streams but forgot to dealloc them once finished. Normally that's not a problem because the toolkit keeps track of the created streams and deletes them when the molecule is deleted. Except in this case there were about 30,000 streams and the deallocation code wasn't designed to handle that well.


Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology. Need contract programming, help, or training? Contact me



Copyright © 2001-2013 Andrew Dalke Scientific AB