A Tale of PostgreSQL Types and Python
Posted by troy Tue, 06 Jun 2006 04:00:00 GMT
Thinking of integrating user-defined types in PostgreSQL and psycopg2? Here's the path I walked, hopefully you can avoid the same pitfalls.
I was after a PostgreSQL data type for storing Python pickles. This seemed simple enough because the database facility for aliasing a type already exists. So the first thing I tried was a CREATE DOMAIN statement in PG, like so:
CREATE DOMAIN pickle AS bytea;This worked within PG but not within the psycopg2 extension. It seems that psycopg2 reports the underlying data type, not the domain. I've since opened a ticket in their Trac. I hope this will get fixed in psycopg2, because this is still the most reasonable technique.
The next thing I tried was the CREATE TYPE statement (using the second form to create a new base type). This was a pretty obvious thing to do, so I thought, as the input/output/send/receive functions are already defined. Not obvious enough! Seems there's no way to reference the BYTEA functions because they're in the PG executable, not within a shared library.
Undeterred, I turned to pyscopg2 for it's ability to map composite types into python callables. Here's the composite type definition in PG, using the first form of CREATE TYPE:
CREATE TYPE pickle AS data BYTEA;After defining and registering a Python callable to handle the data, everything worked. Here's an example of the Python code:
from cPickle import dumps, loads
from base64 import b64encode, b64decode
import psycopg2, psycopg2.extensions
class PgPickle:
def __init__(self, payload, protocol=-1):
self.payload = payload
self.protocol = protocol
def __conform__(self, proto):
if proto == psycopg2.extensions.ISQLQuote:
return self
def getquoted(self):
return "ROW('%s')" % b64encode(dumps(self.payload, self.protocol))
def load(value, cursor):
if value is not None:
return loads(b64decode(value[1:-1]))
load = staticmethod(load)
def register_pickle_type(cursor):
sql = """
SELECT 'pickle'::regtype::oid
"""
cursor.execute(sql)
type_oids = cursor.fetchone()
pg_pickle = psycopg2.new_type(type_oids, "pickle", PgPickle.load)
psycopg2.register_type(pg_pickle)
return type_oids[0]This didn't work with the BYTEA type, so I changed it to TEXT instead of dealing with the decoding (call me lazy). While that was a minor problem, the larger problem was that the solution wasn't satisfying or natural. Also, please note that the above code isn't tested; it's merely a monkey-c+monkey-v from a test script I had lying around. YMMV.
Still undeterred, I decided to bite the bullet, as they say, and write a custom PG type in C. "Can't be that hard," I told myself. And it wasn't. After rooting through the PG source and the contributed extensions, I came up with this module, pgpickle.c:
#include "postgres.h"
#include "utils/builtins.h"
#include "lib/stringinfo.h"
typedef bytea pickle;
PG_FUNCTION_INFO_V1(pickle_in);
PG_FUNCTION_INFO_V1(pickle_out);
PG_FUNCTION_INFO_V1(pickle_receive);
PG_FUNCTION_INFO_V1(pickle_send);
Datum
pickle_in(PG_FUNCTION_ARGS)
{
// takes string, returns pickle
char *str = PG_GETARG_CSTRING(0);
pickle *result = DatumGetByteaP(DirectFunctionCall1(byteain, CStringGetDatum(str)));
PG_RETURN_BYTEA_P(result);
}
Datum
pickle_out(PG_FUNCTION_ARGS)
{
// takes pickle, returns cstring
char *result;
result = DatumGetCString(DirectFunctionCall1(byteaout, PG_GETARG_DATUM(0)));
PG_RETURN_CSTRING(result);
}And it's associated SQL script, pgpickle.sql:
DROP TYPE pickle CASCADE;
DROP FUNCTION pickle_in(cstring);
DROP FUNCTION pickle_out(pickle);
-- force a reload if needed
LOAD 'pgpickle.so';
CREATE FUNCTION pickle_in(cstring) RETURNS pickle
AS 'pgpickle.so'
LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION pickle_out(pickle) RETURNS cstring
AS 'pgpickle.so'
LANGUAGE C IMMUTABLE STRICT;
CREATE TYPE pickle (
INTERNALLENGTH = 4,
EXTERNALLENGTH = variable,
INPUT = pickle_in,
OUTPUT = pickle_out
);
COMMENT ON FUNCTION pickle_in(cstring)
IS 'function to convert string to pickle';
COMMENT ON FUNCTION pickle_out(pickle)
IS 'function to convert pickle to string';
COMMENT ON TYPE pickle
IS 'nifty bytea synonym';Now, I haven't programmed C seriously since school, and while the above code seemed to work, I didn't want to commit to testing it and maintaining it. So I went back to the Python code. For a day. I can't ever leave something be (OCD? In my house, we call it Orange County Department 'cause the kids are down with it, too, and we don't want to give them a complex. Not yet.) The last approach I tried, the one I'm still using, is a simple and powerful INSERT statement:
INSERT INTO pg_catalog.pg_type (
typname, typnamespace, typowner, typlen,
typbyval, typtype, typisdefined, typdelim,
typrelid, typelem, typinput, typoutput,
typreceive, typsend, typanalyze, typalign,
typstorage, typnotnull, typbasetype, typtypmod,
typndims, typdefaultbin, typdefault
)
VALUES (
'pickle', 11, 1, -1,
'f', 'b', 't', ',',
0, 0, 'byteain', 'byteaout',
'bytearecv', 'byteasend', 0, 'i',
'x', 'f', 0, -1,
0, '', ''
);There's probably a better way to formulate the statement, but I was so elated at it's correct operation that I backed away from the keyboard slowly. And did a little jig of joy. This last approach creates the type correctly in PG, and with a little magic in Python, maps the type correctly to client code. Here's the final Python version:
import psycopg2._psycopg as dbapiext
def register_clone_type(connection, source_name, target_name):
sql = """
SELECT %s::regtype::oid
"""
cursor = connection.cursor()
string_types = dbapiext.string_types
cursor.execute(sql, (source_name, ))
source_oid = cursor.fetchone()[0]
cursor.execute(sql, (target_name, ))
target_oid = cursor.fetchone()[0]
if target_oid not in string_types:
string_types[target_oid] = string_types[source_oid]
return target_oid
register_clone_type(myconnection, 'bytea', 'pickle')Other posts about this post.
This post has been discussed on the following web sites / blogs. If you wish to trackback to this post please use the following trackback address: /articles/trackback/4
Spread the word.
Round Rock River supports RSS (Real Simple Syndication), and Trackbacks from other blogs.
Your Reply
Sorry, comments are disabled for this post