[Dbix-class] memory leak - found

Brandon Black blblack at gmail.com
Wed Mar 15 19:58:56 CET 2006


On 3/15/06, Alan Humphrey <alan.humphrey at comcast.net> wrote:
>
>
>
> Found it.  There's good news and there's bad news.  The good news: most
> people won't be affected by the leak.   The bad news: it's not just me.
>
>
>
> The problem is in DBI.pm, specifically in the sth method:
>
>
>
> sub sth {
>
>   my ($self, $sql) = @_;
>
>   # 3 is the if_active parameter which avoids active sth re-use
>
>   return $self->dbh->prepare_cached($sql, {}, 3);
>
> }
>
>
>
> The prepare_cached call means that the statement handle is saved in
> $self->dbh for that particular sql statement.  If the statement is used
> again the existing statement handle is returned from the cache.  Using the
> cache is where the leak occurs.  It isn't cleared until dbh is destroyed.
>
>
>
> So why was I affected and most people won't be?  I'm using the FreeTDS
> library for low level access to the database.  That library doesn't support
> bind variables, so I subclassed DBI.pm to override the _execute method so I
> could build sql statements without bind variables.
>
>
>
> That's all well and good, but it meant that the $sql passed to sth was
> always different.  There were no hits in the cache so a new statement handle
> was created for each statement.  The solution: override sth and use
> dbh->prepare instead of dbh->prepare_cached:
>
>
>
> sub sth {
>
>   my ($self, $sql) = @_;
>
>   return $self->dbh->prepare( $sql, {} );
>
> }
>
>
>
> Bottom line: there is a leak but as long as your driver supports bind
> variables you are not likely to be affected.
>
>

This can be a general DBI-level problem even in the case of drivers
that do support bind variables, if you have an endless and
everchanging supply of table names (and probably other esoteric
situations as well).

I converted to using prepare_cached instead of prepare in several key
spots in some of my straight-DBI code, only to end up with slow
long-term memory leaks.  Because my table names continually vary
slowly over time (every several thousands hits of a query to a given
table name, the table name would change again), I was still getting an
extremely high hit rate, but I was also getting a continual (if
somewhat slower) rate of fresh new sql statements in the cache that is
neverending.

It would be nice if DBI and/or Postgres (postgres is involved in this
as well, since ->prepare_cached on recent postgres caches the prepped
statement server-side) had an option to limit the total number of
cached/prepared statements to X based on LRU (or even better, based on
least savings in terms of how often the sth is used * the prep costs
it saved), but that's pie in the sky for now.  In the meantime I just
have my daemons disconnect and reconnect to the db every so often to
flush it all out.  It might be even smarter to have them do their own
smart sth reaping on $dbh->{CachedKids}.

It's not as much of an issue with "normal" DBIC usage, as the
statements should map at worst 1:1 with the various DBIC method calls
you see in your app source code, which are finite.

-- Brandon



More information about the Dbix-class mailing list