Automatic Pickle Serialization and Deserialization with PostgreSQL
Posted by troy Tue, 13 Jun 2006 04:00:00 GMT
Building on the earlier success of creating a pickle data type in PostgreSQL, I've come up with this small and comfortable way of automatically serializing and de-serializing python pickles with PostgreSQL and psycopg2.
First, a quick recap of the PG data-type-creation-by-cloning method. As the database super user, issue a command like this to create a copy of the text type named pickle:
INSERT INTO pg_catalog.pg_type
SELECT 'pickle' AS typname,
typnamespace,
typowner,
typlen,
typbyval,
typtype,
typisdefined,
typdelim,
typrelid,
typelem,
typinput,
typoutput,
typreceive,
typsend,
typanalyze,
typalign,
typstorage,
typnotnull,
typbasetype,
typtypmod,
typndims,
typdefaultbin,
typdefault
FROM pg_catalog.pg_type
WHERE typname = 'text';We have to create a new type so that we have something unique and specific to register with the extension. Now create a test table using the new type like this:
CREATE TABLE employee (
employee SERIAL PRIMARY KEY,
fullname TEXT,
photo PICKLE
);Both of these commands are in the attachment pickletest.sql.
We also need a python module and class for pickling. Put a file named emp.py somewhere in your python search path, and fill it like so:
#!/usr/bin/env python
from pgpickle import conform_method, quoted_pickle_method
class EmployeePhoto:
def __init__(self, content_type, content):
self.content_type = content_type
self.content = content
__conform__ = conform_method()
getquoted = quoted_pickle_method()That's it! Two simple statements and EmployeePhoto instances are ready for automatic pickling on INSERT and UPDATE.
A couple of notes about this EmployeePhoto class are in order. First, the conform and getquoted methods are used by psycopg2 to adapt instances as they're inserted into the database. Second, I've coded them here as closures in the pgpickle module (see below) because I think the approach is simpler than messing with class inheritance -- having them as separate constructs allows for mix and match as appropriate, and as the GoF admonishes "favor object composition over inheritance." There's nothing stopping you from restating these two methods as a mix-in class, but I'll take my closures where I can get them.
Also in your python path, create the pgpickle.py module, and make its contents match this:
#!/usr/bin/env python
from cPickle import dumps, loads
from base64 import b64encode, b64decode
import psycopg2
import psycopg2.extensions as extensions
import psycopg2._psycopg as _psycopg2
def conform_method(match=extensions.ISQLQuote):
def inner(self, proto):
if proto == match:
return self
return inner
def quoted_pickle_method(protocol=-1):
def inner(self):
return "'%s'" % b64encode(dumps(self, protocol))
return inner
def load_pickle(value, cursor):
if value is not None:
return loads(b64decode(value))
def register_data_type(connection, pg_type_name, type_name, loader):
sql = "SELECT %s::regtype::oid"
cursor = connection.cursor()
cursor.execute(sql, (pg_type_name, ))
type_oids = cursor.fetchone()
type_oid = type_oids[0]
if type_oid not in extensions.string_types:
new_type = extensions.new_type(type_oids, type_name, loader)
extensions.register_type(new_type)
return type_oidLast but not least, we need a third python script to bring this all together. This would normally be your application script, but we're just testing here, so let's call this one addemp.py. Start it off like this:
#!/usr/bin/env python
import urllib2
import psycopg2
import emp
import pgpickle
Then add a main script check and a few lines to fetch an image from the net and create an EmployeePhoto instance:
if __name__ == '__main__':
img_url = 'http://static.flickr.com/31/65132743_68b35b799d_m.jpg'
img = urllib2.urlopen(img_url).read()
cartman_photo = emp.EmployeePhoto('image/jpeg', img)Then create a connection, register the pickle type with it, and create a cursor. Adjust the connect call to match your database hostname, username, database name and password:
con = psycopg2.connect('dbname=yourdb host=localhost user=yourname password=none')
pgpickle.register_data_type(con, 'pickle', 'pickle', pgpickle.load_pickle)
cur = con.cursor()Here's where the work starts to pay off. We run a normal insert statement, passing the EmployeePhoto instance as a parameter. No extra fiddling:
sql = "INSERT INTO employee (fullname, photo) VALUES (%s, %s)"
cur.execute(sql, ('Eric Cartman', cartman_photo))
print cur.statusmessageAnd finally, let's pull out the new record and compare it and the instance we get against what we put in:
sql = "SELECT employee, fullname, photo FROM employee WHERE fullname = %s"
cur.execute(sql, ('Eric Cartman', ))
cartman = cur.fetchone()
photo = cartman[2]
print cartman
assert photo.content_type == cartman_photo.content_type
assert photo.content == cartman_photo.contentHere's how I ran this file and what it produced:
# python addemp.py
INSERT 37309 1
(23, 'Eric Cartman', <emp.EmployeePhoto instance at 0xb79f7dcc>)Notice that it took me only 22 attempts to get this recipe correct! Take that, you statically-typed Enterprisy fiends! :)
You should be able to use this technique as-is in your own applications, and of course you're welcome to adapt it to fit your needs. Either way, I hope you enjoy!
Files:
Your Comments.
Other posts about this post.
This post has been discussed on the following web sites / blogs. If you wish to trackback to this post please use the following trackback address: /articles/trackback/9
Spread the word.
Round Rock River supports RSS (Real Simple Syndication), and Trackbacks from other blogs.
Your Reply
Sorry, comments are disabled for this post
looks very clean man!
i usually dumps my object in my python script, then write it to a type text. other way round, after select, i loads it. not very clean, but clumsy and a no brainer. hail to ejaculatio praecox! ;-)