[Dbix-class] Fwd: Re: DBD::Multiplex - do NOT use

Sat Sep 9 09:03:55 CEST 2006

Delivered-To: mailing list dbi-dev at perl.org
Delivered-To: dbi-dev at perl.org
Reply-To: henri at shopzilla.com
To: "Tim Bunce" <Tim.Bunce at pobox.com>, dbix-class at lists.rawmode.org
Cc: tkishel+perl at gmail.com, dbi-dev at perl.org,
         "Tomas Doran" <bobtfish at bobtfish.net>
Subject: Re: [Dbix-class] DBD::Multiplex - do NOT use
From: henri at shopzilla.com
Date: Sat, 9 Sep 2006 06:51:44 +0000

I'm the author of DBIx::HA.
(And I'm not on the dbix-class mailing list, so this may bounce from it).

I went ahead and developed DBIx::HA because:
- I needed a high availability module that understood primary master 
and replicated databases
- I wanted to be able to distribute load on read-only databases
- I needed DBI to play nice with Sybase's OpenSwitch product
- I needed it to work well with Apache::DBI
- and DBD::Multiplex seemed good at write-to-many, read-from-one 
while I wanted write-to-one, read-from-many.

At this point, DBIx::HA is at version 0.95 on CPAN but I internally 
am testing 0.98. If someone wants it asap, I'll put it on CPAN as it 
has a few necessary improvements.

DBIx::HA is a relatively simple concept:

You specify a stack of database dsns. If upon a connect, prepare or 
execute there's a failure, it either tries again the same stack entry 
or goes down the stack.

The interesting stuff is that you can specify connect or execute 
timeouts that work in all cases (thanks to Sys::SigAction), you can 
set up a round robin on the stack, and in the case of working under 
Apache::DBI, you can have all your processes fail over at once if one 
process fails over (so you always only have one write master at any 
one time).
You can also easily force a failover, and the system never reverts 
automatically back to a db higher up on the stack when it goes back 
online.

The main drawback of this approach is that I do nothing at the fetch 
level. So if a write statement fails in the middle of a fetchrow, you 
have a problem. Unless of course you're using transactions 
everywhere, which is what you should do anyway :)

I recommend this module to those who use database replication on the 
backend and want to be able to take advantage of the 
primary/backup(s) system that this replication brings them.

And finally, it was my intention to write this module only as a 
stopgap until DBD::Multiplex was up to snuff. :)
H

--
Henri Asseily
henri at shopzilla.com
CTO, Shopzilla

-----Original Message-----
From: Tim Bunce <Tim.Bunce at pobox.com>
Date: Fri, 8 Sep 2006 23:39:34
To:dbix-class at lists.rawmode.org
Cc:tkishel+perl at gmail.com, dbi-dev at perl.org,       Tomas Doran 
<bobtfish at bobtfish.net>
Subject: Re: [Dbix-class] DBD::Multiplex - do NOT use

On Fri, Sep 08, 2006 at 02:24:52PM -0400, Rob Kinyon wrote:
>  mst asked me to write up a justification as to why DBD::Multiplex
>  should be completely disregarded when developing the master/slave
>  round-robin code.
>
>  I was asked to look at added transactional capabilities to
>  DBD::Multiplex when dealing with master-slave situations. When I dove
>  in, I found a complete mess. Basically, the structure is as follows:
>
>  1) You create a set of $dbh's and designate one as the master.
>  2) It kinda does the right thing when dealing with prepare().
>  3) The multiplexing code is identical regardless of whether you're
>  multiplexing over a drh, dbh, or sth.
>  4) There is no round-robining - it's just broadcast, broadcast to
>  masters, or broadcast to slaves.
>  5) There are no tests. I started writing some tests and found bugs
>  everywhere I looked.
>  6) There are no comments.
>
>  #3 is the biggest issue - $dbh's and $sth's are treated identically
>  when they cannot be.

In the typical use-cases for DBD::Multiplex the $dbh would have multiple
underlying handles, but the prepare would return an $sth that only had one.
The prepare() picks which underlying dbh(s) should be used, and typically
it's just one - the multiplexing code isn't relevant in that situation.

>  Frankly, it would be faster to just create a new DBD::Multiplex from
>  scratch, taking lessons learned, than to attempt to clean up the
>  current DBD::Multiplex.

The DBD::Multiplex on cpan has changed very little since 2004.

I have a version that's rather more flexible in many ways but, sadly,
the project it was being developed for got cancelled and the existing
work wasn't polished/documented/tested enough to release.

You can find it under http://svn.perl.org/modules/dbi/trunk/lib/DBD
It needs more work.

On Fri, Sep 08, 2006 at 02:44:35PM -0400, Drew Taylor wrote:
>
>  Crap. Not good. Interesting though that Tim Bunce mentioned it in
>  OSCON talks this summer, so I assumed it was a goodo bet. :-(

I think it said it was in need of some love and asked for volunteers :)

>  I'm in the process of adding replication to a mysql installation (RT
>  3.4.x) and was planning to use DBD::Multiplex. Do you have any
>  suggestions for alternatives in the meantime? The application is setup
>  such that I think it would be a royal PITA to hack the functionality
>  into RT.

I believe DBIxHA is one of the more advanced high-availability modules
for the DBI. (I think it's the only one that uses swap_internal_handle,
for example.) It's also designed to be very fast, but it looks like it's
not designed to separate reads and writes.

There's also DBD::Multi by Casey West and Dan Wright which says it has
some "EXPERIMENTAL" support for read/write but the docs are sparse:
http://search.cpan.org/~dwright/DBD-Multi/lib/DBD/Multi.pm

On Fri, Sep 08, 2006 at 02:54:16PM -0400, Rob Kinyon wrote:
>
>  I don't have any good suggestions in terms of packaged solutions.
>
>  One thing to keep in mind is that you usually have a really good idea
>  of when you're doing a mutate and when you're not. So, you could just
>  defer your DBH creation until you know whether you're mutate-free or not.
>
>  Now, hacking that into RT will be .... interesting. And, some
>  applications ALWAYS do a mutate (for logging or auditing purposes), so
>  this sort of round-robin architecture will never work. All the
>  applications I write tend to fall into that category. I don't know 
>if RT does.

Seems like there are two basic choices: either some "smart" driver
tries to hide the fact there are separate read and write dbh's,
or the application (or ORM-type-layer) explicitly maintains separate
handles.

Smart drivers add overhead and can't always tell if a statement mutates
data (ie stored procedures) - but they can be very useful in places.

ORMs are in the fortunate position of knowing if they're issuing a
mutating statement (in general) and so can pick which handle to use.

On Fri, Sep 08, 2006 at 11:56:38AM -0700, J. Shirley wrote:
>  Right now I'm also looking into setting up multiplexing in DBIC.  I
>  think that given the problems with DBD::Multiplex it would be prudent
>  to start from scratch.  I'll chip in on this as much as I can,
>  although others may have more time to dedicate towards it so I can
>  either write some test cases and docs, or help out on the less
>  core-specific features.

On Fri, Sep 08, 2006 at 08:47:47PM +0100, Matt S Trout wrote:
>
>  Maybe it'd be worthwhile writing some simpler, more lightweight DBDs along
>  this line - maybe a DBD::HA (failover) and a DBD::LoadBalance (for 
>round robin
>  etc.) that Storage could then use?

Writing a 'transparent' DBD layered over another is non-trivial - see
assorted comments in DBD::Multiplex.

Unless there's a very good reason not to I'd suggest that DBIx::Class
add the concept of a read-only db connection that defaults to the main
db connection. Then use that read-only db connection for all read-only
db actions.

(You need to give consideration to how that works with transactions -
but I'm pretty sure those issues will be much easier to address at the ORM
level than they would be if you tried to put the smarts into the DBD layer.)

Tim.