[Dbix-class] DBIx::Class::Storage::DBI::Replicated - read from master

Wed Jun 9 20:15:03 GMT 2010

>
>From: Bill Moseley <moseley at hank.org>
>To: dbix-class <dbix-class at lists.scsys.co.uk>
>Sent: Tue, June 8, 2010 12:31:16 AM
>Subject: [Dbix-class] DBIx::Class::Storage::DBI::Replicated - read from master
>
>
>Sanity check.
>
>
>I need to have DBIC work with an existing database that includes slaves.  As the Replicated docs note, there's often a replication delay from master to slaves.  So, when a write happens I want to direct all reads to the master for some period of time (for the specific user that did the write).  In addition, I need a way to communicate to other applications/processes that they also need to use the master for any requests by that user.

Interesting idea.  The way DBIC Replication works is that by default when a slave falls too far behind it gets dropped from the pool until it catches up.  Anything wrapped inside a transaction automatically does both reads and writes to the master.  I didn't think about a strategy where after an insert access to slaves would be temporarily suspended.  Seems a bit heavy handed to me, but I guess that could be grafted in.

>
>
>The communication between applications is done via memcached.
>
>
>I'm using Postgresql and Slony.  I have not looked into implementing lag_behind_master (as I'm not clear how that works in ::Replicated).
>

The current replication only works with MySQL native replication.

If you look inside DBIC::Storage::DBI::mysql.pm you'll see two methods, "is_replicating" and "lag_behind_master" which is probably pretty specific to MySQL.  When I originally wrote this stuff I had little to no idea how replication works under other databases so was at a loss really to figure out how to properly abstract this.  'lag_behind_master' is a numeric value that is supposed to be the number of seconds a given slave is behind the master.  In practice the number is pretty relative.  In DBIC replication support  this number is used to determined if a slave is 'too stale' to use.

>
>
>The existing (non-DBIC) application will set a flag in memcached when a write happens.  This is keyed by user id. And each request memcached is checked to see if the current user needs to read from the master.  I'm looking at a way to duplicate that behavior with ::Replicated.
>

Well, you can flip replication off on a per query basis:

my $RS = $schema->resultset('Source')->search(undef, {force_pool=>'master'});

I'd probably hack into my Catalyst model to add that bit to search resultset automatically.  

>I'd like to know if this seems like a reasonable approach, and if anyone sees any gotchas that I need to be aware of.
>
>
>First, I subclass ::Replicated in Catalyst::Model::DBIC::Schema config via storage_type => 'MyApp::DB::Replicated'.  The point of this subclass is two things: 1) set a flag in memcached and 2) force all reads to the master for the remainder of the request.
>
>
>This subclass looks like:
>
>
>package MyApp::DB::Replicated;
>>use Moose;
>>extends 'DBIx::Class::Storage::DBI::Replicated';
>>use namespace::autoclean;
>>
>>
>>has flag_write => ( is => 'rw' );
>>
>>
>>
>>
>>my @methods = qw/
>>    insert
>>    insert_bulk
>>    update
>>    delete
>>/;
>>
>>
>>after \@methods => sub {
>>    my $self = shift;
>>    $self->flag_write->();
>>    $self->set_reliable_storage;
>>};
>>
>>
>>__PACKAGE__->meta->make_immutable;
>>1;
>
>
>So, after the methods listed a sub is called to flag (in memcached) that a write happened, and then set_reliable_storage is forced on to make any subsequent reads go to the master (for the remainder of the request).
>
>
>Replicated will force all reads to the master for reads inside a transaction, but a single request might span multiple transactions (and selects outside of a txn_do), so forcing it for the reminder of the request seems the best option.
>
>
>Now, in the Catalyst Model I need a way to force reads to the master, and also set memcached when a write to the master happens:
>
>
>before 'ACCEPT_CONTEXT' => sub {
>    my ( $self, $c ) = @_;
>
>
>    my $schema  = $self->schema;
>    my $storage = $schema->storage;
>
>
>
>
>    return if $c->stash->{_replicated_set}++
>    || !$storage->isa( 'DBIx::Class::Storage::DBI::Replicated' );
>
>
>    # callback to flag that reads go to master.
>    $storage->flag_write( sub { $self->force_master($c, 1) } );
>
>
>    # Should all reads go to master?
>>    if ( $self->force_master( $c ) ) {
>        $storage->set_reliable_storage;
>    }
>    else {
>        $storage->set_balanced_storage;
>    }
>
>
>};
>
>
>That doesn't feel bullet proof by any means, but does this seem like a good way to hook into DBIC for this?
>
>

Seems there's a bit more here than you need.  Like I mentioned above you got that force_pool=>'master' thing.  Honestly I'd probably start by adding that force_pool stuff to my controllers and make sure it works and then back it into the model.  I'm not so excited by the "set_reliable_storage" and its counterpart since the way it flips state is a bit shaky in my mind.  That was my first attempt at giving the user this kind of control and I left it in for backcompat.  You should also take a look at the test for replication (in the t directory) and make heavy used of DBIC_TRACE=1 to see what is going on.  I override this output so you can see what slaves (or the master) are picking up which bit.

I'm very excited to see that postgresql replication of some sort supported.  I only did mysql since $work was paying for just that.  I was also thinking about about a cross database replication support system for DBIC, but that's really a lot of work to get right.

>
>
>BTW -- Do you think ::Replicated should load any class specified by "storage_type"?  I'm having to explicitly "use" my subclass.

There was a reason for this but I can't recall.  Anybody else remember?

>
>
>Thanks,
>
>
>
>-- 
>Bill Moseley
>moseley at hank.org
>