Tracing python code
Sometimes it's helpful to watch what Python the low-level details of what Python is doing. With an IDE like Wing IDE or Komodo you can step through events in a GUI. But if you're a command-line sort of person like me you can roll your own trace program using the same run-time hooks used by those IDEs
The hook is set via the sys.settrace() function. This takes a function which is called during any of several events. One is the "line" event. Before the interpreter executes a line it calls the trace function with the current frame and the string "line". (The function gets a third argument which is None; that fields is used by other events.
The frame contains the attribute "f_lineno" which is the line number about to be run. Here is a simple example which only prints the current line number. Note that the trace function returns itself. If it doesn't then Python's trace mechanism interprets that as a request to stop the tracing while inside the given scope.
import sys
def traceit(frame, event, arg):
if event == "line":
lineno = frame.f_lineno
print "line", lineno
return traceit
def main():
print "In main"
for i in range(5):
print i, i*3
print "Done."
sys.settrace(traceit)
main()
Here is the output
line 11 In main line 12 line 13 0 0 line 12 line 13 1 3 line 12 line 13 2 6 line 12 line 13 3 9 line 12 line 13 4 12 line 12 line 14 Done.
Stare at the output long enough and you can figure out what's going on but it isn't easy. It's more helpful to also print the line. The linecache module is available for this but it needs to know the filename. In this case I know all of the lines are from "traceit.py" so I'll hard-code that into the program.
import sys
import linecache
def traceit(frame, event, arg):
if event == "line":
lineno = frame.f_lineno
line = linecache.getline("traceit.py", lineno)
print "line %d: %s" % (lineno, line.rstrip())
return traceit
Here's the output
line 13: print "In main" In main line 14: for i in range(5): line 15: print i, i*3 0 0 line 14: for i in range(5): line 15: print i, i*3 1 3 line 14: for i in range(5): line 15: print i, i*3 2 6 line 14: for i in range(5): line 15: print i, i*3 3 9 line 14: for i in range(5): line 15: print i, i*3 4 12 line 14: for i in range(5): line 16: print "Done." Done.Much more readable. But what if the program uses another file so that the hard-coded traceit.py doesn't work? Instead of saying print i, i*3 I'll use print i, random.randrange(0, 10) to get a random integer in the range 0<=n<10. I chose this because it is implemented in Python. I originally chose random.random() but that's implemented in C so it wasn't interesting. Here's the output:
line 14: print "In main" In main line 15: for i in range(5): line 16: print i, random.randrange(0, 10) 0 line 141: line 142: line 144: line 150: line 151: line 153: line 166: 8 line 15: for i in range(5): line 16: print i, random.randrange(0, 10) 1 line 141: line 142: line 144: line 150: line 151: line 153: line 166: 8 line 15: for i in range(5): line 16: print i, random.randrange(0, 10) 2 line 141: line 142: line 144: line 150: line 151: line 153: line 166: 7 line 15: for i in range(5): line 16: print i, random.randrange(0, 10) 3 line 141: line 142: line 144: line 150: line 151: line 153: line 166: 8 line 15: for i in range(5): line 16: print i, random.randrange(0, 10) 4 line 141: line 142: line 144: line 150: line 151: line 153: line 166: 5 line 15: for i in range(5): line 17: print "Done." Done.As you can see it isn't very helpful because it gets lines from the wrong file.
The current filename is available from the module's globals dictionary using the special key of __file__. Here's a trace function that only displays the filename and line number.
import sys
import linecache
import random
def traceit(frame, event, arg):
if event == "line":
lineno = frame.f_lineno
filename = frame.f_globals["__file__"]
print "file %s line %d" % (filename, lineno)
return traceit
def main():
print "In main"
for i in range(5):
print i, random.randrange(0, 10)
print "Done."
sys.settrace(traceit)
main()
and the output is
file <stdin> line 14 In main file <stdin> line 15 file <stdin> line 16 0 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166 2 file <stdin> line 15 file <stdin> line 16 1 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166 3 file <stdin> line 15 file <stdin> line 16 2 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166 0 file <stdin> line 15 file <stdin> line 16 3 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166 2 file <stdin> line 15 file <stdin> line 16 4 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 141 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 142 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 144 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 150 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 151 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 153 file /System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/random.pyc line 166 3 file <stdin> line 15 file <stdin> line 17 Done.The "<stdin>" is because I ran the code from Emacs in python-mode. If I run it from the command-line I get the correct filename.
file traceit.py line 14 In main file traceit.py line 15 file traceit.py line 16This means we'll need to treat the main program somewhat carefully. What I'll do is force the correct filename if it's found to be "<stdin>".
If you look at the filenames you'll see the ones from the random module end with ".pyc". This means the code actually came from the precompiled bytecode and not the original Python file. Getting the right file means dropping the final "c"; or "o" for the ".pyo" optimized bytecode files. For this example I don't want to see the full filename so I'll use the module's __name__ instead.
The current version of the code is
import sys
import linecache
import random
def traceit(frame, event, arg):
if event == "line":
lineno = frame.f_lineno
filename = frame.f_globals["__file__"]
if filename == "<stdin>":
filename = "traceit.py"
if (filename.endswith(".pyc") or
filename.endswith(".pyo")):
filename = filename[:-1]
name = frame.f_globals["__name__"]
line = linecache.getline(filename, lineno)
print "%s:%s: %s" % (name, lineno, line.rstrip())
return traceit
def main():
print "In main"
for i in range(5):
print i, random.randrange(0, 10)
print "Done."
sys.settrace(traceit)
main()
and the output is
__main__:21: print "In main" In main __main__:22: for i in range(5): __main__:23: print i, random.randrange(0, 10) 0 random:141: istart = int(start) random:142: if istart != start: random:144: if stop is default: random:150: istop = int(stop) random:151: if istop != stop: random:153: if step == 1 and istart < istop: random:166: return int(istart + int(self.random()*(istop - istart))) 3 __main__:22: for i in range(5): __main__:23: print i, random.randrange(0, 10) 1 random:141: istart = int(start) random:142: if istart != start: random:144: if stop is default: random:150: istop = int(stop) random:151: if istop != stop: random:153: if step == 1 and istart < istop: random:166: return int(istart + int(self.random()*(istop - istart))) 3 __main__:22: for i in range(5): __main__:23: print i, random.randrange(0, 10) 2 random:141: istart = int(start) random:142: if istart != start: random:144: if stop is default: random:150: istop = int(stop) random:151: if istop != stop: random:153: if step == 1 and istart < istop: random:166: return int(istart + int(self.random()*(istop - istart))) 7 __main__:22: for i in range(5): __main__:23: print i, random.randrange(0, 10) 3 random:141: istart = int(start) random:142: if istart != start: random:144: if stop is default: random:150: istop = int(stop) random:151: if istop != stop: random:153: if step == 1 and istart < istop: random:166: return int(istart + int(self.random()*(istop - istart))) 7 __main__:22: for i in range(5): __main__:23: print i, random.randrange(0, 10) 4 random:141: istart = int(start) random:142: if istart != start: random:144: if stop is default: random:150: istop = int(stop) random:151: if istop != stop: random:153: if step == 1 and istart < istop: random:166: return int(istart + int(self.random()*(istop - istart))) 7 __main__:22: for i in range(5): __main__:24: print "Done." Done.
How is this helpful for debugging? Let me modify the code just a bit
import sys
import linecache
def traceit(frame, event, arg):
if event == "line":
lineno = frame.f_lineno
filename = frame.f_globals["__file__"]
if (filename.endswith(".pyc") or
filename.endswith(".pyo")):
filename = filename[:-1]
name = frame.f_globals["__name__"]
line = linecache.getline(filename, lineno)
print "%s:%s: %s" % (name, lineno, line.rstrip())
return traceit
Terry Brunck over at Daylight found a problem with my PyDaylight
library. At the end of a function after it was finished and Python
was doing garbage collection the code seemed to do nothing. Rather it
was doing something because the CPU was pegged but we couldn't figure
out what that was.
I wrote a trace function like the one above and put the following at the end of the function that had the problem so it would be called just before it finished:
import sys
sys.settrace(traceit)
This let me watch what the finalization code did. The problem was
tracked down to a dt_dealloc() call meaning the Daylight
toolkit was making the big demands on the CPU. That clue helped
narrowed down the problem space. It ended up being three different
things. Terry's code made heavy use of a PyDaylight function that
made Daylight streams but forgot to dealloc them once finished.
Normally that's not a problem because the toolkit keeps track of the
created streams and deletes them when the molecule is deleted. Except
in this case there were about 30,000 streams and the deallocation code
wasn't designed to handle that well.
Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology. Need contract programming, help, or training? Contact me
Copyright © 2001-2010 Dalke Scientific Software, LLC.


