2008-08-03

Python shelve speed testing

import shelve, os, time, sha

class Foo:
    def __init__(self):
        self.data = sha.new(repr(time.time())).hexdigest()

nobjects = 100000

try:
    os.unlink('speedtest.db')
except OSError:
    print "creating db file"

start = time.time()

db = shelve.open('speedtest')

# Store one pickled Foo per key, flushing to disk after every write.
for i in xrange(nobjects, 0, -1):
    db[str(i)] = Foo()
    db.sync()

db.close()

t = time.time() - start
print t, "sec for", nobjects, "objects;", t/nobjects, "sec/object"

On a 2 GHz MacBook with 1 GB of RAM and Python 2.3.5, this program runs in about 7 seconds ("7.00173902512 sec for 100000 objects; 7.00173902512e-05 sec/object"), or roughly 70 microseconds per stored object. The backend chosen by anydbm was "Berkeley DB 1.85 (Hash, version 2, native byte-order)". Omitting the .sync() call after each write shaves a few hundred milliseconds off the total.
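
For anyone repeating the test on a current interpreter, here is a rough Python 3 sketch of the same measurement. It is not the script that produced the numbers above: hashlib.sha1 stands in for the removed sha module, the function name time_shelve_writes and its parameters (sync_each_write, make_value) are my own, and dbm.whichdb is used to report which backend shelve ended up with. Timings will vary with the backend and the disk.

```python
import dbm
import hashlib
import os
import shelve
import time


class Foo:
    def __init__(self):
        # Same idea as the original: a SHA-1 hex digest of the current time.
        self.data = hashlib.sha1(repr(time.time()).encode("ascii")).hexdigest()


def time_shelve_writes(nobjects, sync_each_write, directory, make_value=Foo):
    """Write nobjects values to a fresh shelf; return (seconds, backend name).

    All names here are illustrative, not taken from the original script.
    """
    path = os.path.join(directory, "speedtest")
    start = time.perf_counter()
    db = shelve.open(path, flag="n")  # "n" always creates a new, empty shelf
    try:
        for i in range(nobjects, 0, -1):
            db[str(i)] = make_value()
            if sync_each_write:
                db.sync()
    finally:
        db.close()
    elapsed = time.perf_counter() - start
    # whichdb guesses the dbm module from the files that were created.
    return elapsed, dbm.whichdb(path)
```

Calling it once with sync_each_write=True and once with False, each time pointing at an empty temporary directory, gives the two timings to compare. Some backends make sync() far more expensive than others (dbm.dumb, for example), so it is worth starting with a small nobjects.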

See also: http://codeidol.com/python/python3/Databases-and-Persistence/Shelve-Files/