Dalke Scientific Software: More science. Less time.
/home/writings/diary/archive/2006/03/19/class_instantiation_performance

Class instantiation performance

I'm writing a GFF3 file parser. I have a class which looks like

class GFF3Feature(object):
    def __init__(self, seqid, source, type, start, end, score, strand,
                 phase, attributes):
        self.seqid = seqid
        self.source = source
        self.type = type
        self.start = start
        self.end = end
        self.score = score
        self.strand = strand
        self.phase = phase
        self.attributes = attributes
I need to create a lot of these. What's the fastest way to instantiate them? I know a few ways. My constraint is that the normal (public) constructor must require all of the parameters and must not accept any others. Here are the ones I tested:
# The normal way
class Feature1(object):
    def __init__(self, seqid, source, type, start, end, score, strand,
                 phase, attributes):
        self.seqid = seqid
        self.source = source
        self.type = type
        self.start = start
        self.end = end
        self.score = score
        self.strand = strand
        self.phase = phase
        self.attributes = attributes

# Copy from locals.  Learned this one from a posting by Alex Martelli.
class Feature2(object):
    def __init__(self, seqid, source, type, start, end, score, strand,
                 phase, attributes):
        self.__dict__.update(locals())
        del self.self


# Bypass normal creation and assume the input is correct
class Feature3(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        self.__class__ = Feature1

# __slots__ might give better performance
class Feature4(object):
    __slots__ = ["seqid", "source", "type", "start", "end", "score",
                 "strand", "phase", "attributes"]
    def __init__(self, seqid, source, type, start, end, score, strand,
                 phase, attributes):
        self.seqid = seqid
        self.source = source
        self.type = type
        self.start = start
        self.end = end
        self.score = score
        self.strand = strand
        self.phase = phase
        self.attributes = attributes
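
To make the Feature3 trick concrete, here is a minimal sketch in Python 3 syntax (the code in this post was written for Python 2). It checks that an object built through the keyword-only private path ends up as an ordinary Feature1, and that only the public constructor validates its parameters. FIELDS, values, public, private, rejected and sloppy are hypothetical names introduced only for this illustration.

```python
# Hypothetical helper: the nine GFF3 column names, in constructor order.
FIELDS = ("seqid", "source", "type", "start", "end", "score",
          "strand", "phase", "attributes")

# Copy of the "normal way" class from the post.
class Feature1(object):
    def __init__(self, seqid, source, type, start, end, score, strand,
                 phase, attributes):
        self.seqid = seqid
        self.source = source
        self.type = type
        self.start = start
        self.end = end
        self.score = score
        self.strand = strand
        self.phase = phase
        self.attributes = attributes

# Copy of the private builder: fill __dict__, then swap the class in place.
class Feature3(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        self.__class__ = Feature1

values = ("qwe", "wert", "exon", 100, 1000, 0.1, "+", 0, {})
public = Feature1(*values)
private = Feature3(**dict(zip(FIELDS, values)))

# After construction the privately built object is a plain Feature1
# carrying the same attributes as the publicly built one.
assert type(private) is Feature1
assert vars(private) == vars(public)

# The public constructor rejects a call with missing parameters ...
rejected = False
try:
    Feature1("qwe")
except TypeError:
    rejected = True

# ... while the private path accepts whatever it is given, which is why
# it is only safe when the caller already knows the input is correct.
sloppy = Feature3(seqid="qwe", bogus=1)
```

The last two steps show the trade-off: the speed of Feature3 comes from skipping the argument checking that the public interface is required to do.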
    
A bit of timing harness
import time

# average and stddev
def stats(data):
    sum = 0.0
    sum2 = 0.0
    for x in data:
        sum += x
        sum2 += x*x
    N = len(data)
    return sum/N, ((sum2-sum*sum/N)/(N-1))**0.5


def create(cls, x):
    for _ in x:
        a = cls(seqid="qwe", source="wert", type="exon", start=100,
                end=1000, score=0.1, strand="+", phase=0, attributes={})
    return a

def main():
    clses = (Feature1, Feature2, Feature3, Feature4)
    tot_times = [[] for cls in clses]
    x = range(40000)
    for i in range(7):
        for cls, tot_time in zip(clses, tot_times):
            t1 = time.time()
            a = create(cls, x)
            t2 = time.time()
            assert a.end == 1000
            tot_time.append(t2-t1)
    for cls, tot_time in zip(clses, tot_times):
        avg, stddev = stats(tot_time)
        tot_time.sort()
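        # The %-formatted, parenthesized form prints the same text under
        # Python 2 and also runs under Python 3.
        print("%s  min= %s avg= %s stddev= %s" %
              (cls.__name__, min(tot_time), avg, stddev))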

if __name__=="__main__":
    main()
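
The stats() helper uses the one-pass "sum and sum of squares" form of the sample standard deviation (dividing by N-1). As a sanity check, the sketch below compares it against the standard library's statistics module, which exists only in Python 3.4 and later, so this is Python 3 code rather than the Python 2 used above. The timings list is made-up sample data, not measurements from this post, and the local names total and total_sq replace sum and sum2 only to avoid shadowing the builtin.

```python
import statistics

def stats(data):
    # One-pass mean and sample standard deviation, as in the harness above.
    total = 0.0
    total_sq = 0.0
    for x in data:
        total += x
        total_sq += x * x
    n = len(data)
    return total / n, ((total_sq - total * total / n) / (n - 1)) ** 0.5

# Made-up values standing in for seven timing runs.
timings = [1.04, 1.09, 1.21, 1.05, 1.10, 1.07, 1.12]

avg, stddev = stats(timings)

# statistics.stdev is also the sample (N-1) standard deviation, so the two
# should agree to within floating-point rounding.
assert abs(avg - statistics.mean(timings)) < 1e-9
assert abs(stddev - statistics.stdev(timings)) < 1e-9
```

One caveat with the one-pass form: when the spread is tiny compared to the mean, the subtraction can lose precision, so for data like that a two-pass computation is the safer choice.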
The results
Feature1  min= 1.03674292564 avg= 1.09673094749 stddev= 0.0599208818078
Feature2  min= 1.18326091766 avg= 1.28605253356 stddev= 0.13197632388
Feature3  min= 0.890069961548 avg= 0.955237865448 stddev= 0.0523913440787
Feature4  min= 0.908952951431 avg= 0.954915148871 stddev= 0.0310191453362
Looks like my best solution is Feature3, which uses a private interface to build the object and then changes its __class__ in place to the public class.

That would be wrong though. Python calls have extra overhead with keyword arguments compared to positional ones. I'll change the code a bit to redo the Feature1, Feature2 and Feature4 tests with positional arguments. (The Feature3 approach requires keyword arguments.)

def create(cls, x):
    for _ in x:
        a = cls("qwe", "wert", "exon", 100,
                1000, 0.1, "+", 0, {})
    return a
The times are:
Feature1  min= 0.485749006271 avg= 0.498355899538 stddev= 0.00987560013671
Feature2  min= 0.613602876663 avg= 0.632926872798 stddev= 0.0336558923515
Feature4  min= 0.344656944275 avg= 0.351382800511 stddev= 0.00944194461424
Instantiating with positional arguments is faster (by roughly a factor of 2!) than with keyword arguments.
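
The keyword versus positional comparison can be isolated with the standard library's timeit module instead of the hand-rolled harness. This is a sketch in Python 3 syntax, not the code that produced the numbers above; make_positional, make_keyword and best_time are hypothetical helper names. The ratio on a current interpreter may well differ from the 2006 figures, so no expected output is shown.

```python
import timeit

# Copy of the "normal way" class from the post.
class Feature1(object):
    def __init__(self, seqid, source, type, start, end, score, strand,
                 phase, attributes):
        self.seqid = seqid
        self.source = source
        self.type = type
        self.start = start
        self.end = end
        self.score = score
        self.strand = strand
        self.phase = phase
        self.attributes = attributes

def make_positional():
    return Feature1("qwe", "wert", "exon", 100, 1000, 0.1, "+", 0, {})

def make_keyword():
    return Feature1(seqid="qwe", source="wert", type="exon", start=100,
                    end=1000, score=0.1, strand="+", phase=0, attributes={})

def best_time(func, number=40000, repeat=7):
    # Same shape as the harness above: 7 runs of 40000 instantiations,
    # reporting the fastest run in seconds.
    return min(timeit.repeat(func, number=number, repeat=repeat))

positional = best_time(make_positional)
keyword = best_time(make_keyword)
print("positional min=", positional)
print("keyword    min=", keyword)
```

Both helpers pay for one extra function call per instantiation, so the absolute times include that overhead; it is the same in both cases, which keeps the comparison fair.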

The fastest approach uses __slots__. Guido once wrote:

__slots__ is a terrible hack with nasty, hard-to-fathom side effects that should only be used by programmers at grandmaster and wizard levels. Unfortunately it has gained an enormous undeserved popularity amongst the novices and apprentices, who should know better than to use this magic incantation casually.
Am I a grandmaster? I've never written a metaclass. My view has been to shy away from cute code and stay with obviously readable code.

There is very little in the way of guidance for when to use __slots__. There is guidance on when not to use it, e.g., to imply that a class is final/closed or to catch accidental typos in instance attribute assignment. Here's another quote, this from Alex Martelli:

__slots__ is, essentially, an optimization (in terms of memory consumption) intended for classes that are likely to be created in very large numbers: you give up the normal flexibility of being able to add arbitrary attributes to an instance, and what you gain in return is that you save one dictionary per instance (dozens and dozens of bytes per instance -- that will, of course, not be all that relevant unless the number of instances that are around is at the very least in the tens of thousands).
Alexander Schmolck said:
I'd recommend against using __slots__ (unless really needed for optimization), because it cripples the usefulness and generality of your classes to your and also *to other people's* code (weakrefs, pickling, reflection -- all compromised). The first two you could fix (but are unlikely to, unless you *yourself* need to use weakrefs or pickling of instances of those classes), the last one you can't.
The fact that someone's code using __slots__ makes other peoples code less useful (because things that work fine for instances without slots suddenly screw up) makes me think introducing __slots__ was a rather bad idea to start with.
Going back to Alex:
So my advice remains: consider __slots__ for classes which may need to be instantiated a HUGE number of times, if the restrictions that __slots__ imposes can be lived with. Like for all optimizations, you'll probably want to measure things, at some level, BEFORE you optimize.
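
The restrictions mentioned in these quotes are easy to see directly. The following is a small sketch in Python 3 syntax; Plain and Slotted are hypothetical two-field stand-ins for the feature classes, and the flag variables exist only for this illustration.

```python
import weakref

class Plain(object):
    def __init__(self, start, end):
        self.start = start
        self.end = end

class Slotted(object):
    __slots__ = ["start", "end"]
    def __init__(self, start, end):
        self.start = start
        self.end = end

plain = Plain(100, 1000)
slotted = Slotted(100, 1000)

# A normal instance accepts arbitrary new attributes, typos included.
plain.strat = 5

# A slotted instance has no per-instance dictionary, so an attribute
# that is not listed in __slots__ cannot be assigned.
has_dict = hasattr(slotted, "__dict__")
typo_rejected = False
try:
    slotted.strat = 5
except AttributeError:
    typo_rejected = True

# Weak references work for the plain class but fail for the slotted one,
# because "__weakref__" was not included in its __slots__.
plain_ref = weakref.ref(plain)
weakref_failed = False
try:
    weakref.ref(slotted)
except TypeError:
    weakref_failed = True
```

Adding "__weakref__" to the __slots__ list restores weak reference support, which is the kind of fix Schmolck says is possible but that class authors rarely bother to make.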

My view is that this is a simple data container class. There is no reason anyone will inherit from it. It's more work to inherit from this class than it is to make a new class with the same duck-type interface, such that no one can tell the difference. The drawbacks of __slots__ should not arise. The 30% performance boost, at least in micro timings, is significant. I made the change in my parser, and using __slots__ speeds up my overall timings by about 5%.

I think in this case it's worthwhile.


Andrew Dalke is an independent consultant focusing on software development for computational chemistry and biology.



Copyright © 2001-2013 Andrew Dalke Scientific AB