[Dbix-class] DBIx::Class::Storage::DBI::Replicated - read from master

Fri Jun 11 15:35:05 GMT 2010

>
>From: Bill Moseley <moseley at hank.org>
>To: DBIx::Class user and developer list <dbix-class at lists.scsys.co.uk>
>Sent: Thu, June 10, 2010 12:30:48 AM
>Subject: Re: [Dbix-class] DBIx::Class::Storage::DBI::Replicated - read from  master
>
>>Hi John,
>
>
>On Wed, Jun 9, 2010 at 1:15 PM, John Napiorkowski <jjn1056 at yahoo.com> wrote:
>
>>>
>>
>>Interesting idea.  The way DBIC Replication works is that by default when a slave falls too far behind it gets dropped from the pool until it catches up.
>
>
>IIUC, Slony does have a way to report slave lag, but I discussed it briefly with our DBA and for the existing (non-DBIC dbi-based) master/slave code it was decided to just use a hard-coded per-user delay.  A user writes to the db and is pegged to master for some period of time.  There's a separate process that removes slaves if the lag gets too big.
>
>
>I would be interested in trying to add lag support, though. Since DBIx::Class::Storage::DBI::Pg doesn't have methods to report the lag, I suppose I'd need to create a subclass.  How do I subclass DBIx::Class::Storage::DBI::Pg?  Is there another way than using a new dsn (e.g. dbi:PgReplicated) to tell DBIC which DBIC driver to use?
>

I'd probably just patch DBIx::Class::Storage::DBI::Pg to support those two methods.  I know it's a bit suspect, but right now it's the only way.  I think we need to do a few of these before the best way to refactor becomes clearer.
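
To make that concrete, here is a minimal sketch, assuming the two methods are is_replicating and lag_behind_master (the names the MySQL storage driver uses).  The package name and the lag_source coderef are hypothetical stand-ins; a real patch would put the methods in the Pg storage class and have them query whatever your replication system exposes.

```perl
package My::Sketch::PgLag;
use strict;
use warnings;

# Hypothetical stand-in for a patched storage driver. Nothing here talks to
# a database; lag_source is an assumed coderef that returns the lag in
# seconds, or undef when the lag cannot be determined.
sub new {
    my ($class, %args) = @_;
    return bless { lag_source => $args{lag_source} }, $class;
}

# How far this slave is behind the master, in seconds (or undef).
sub lag_behind_master {
    my $self = shift;
    return $self->{lag_source}->();
}

# In this sketch a slave counts as replicating when a lag value is available.
sub is_replicating {
    my $self = shift;
    my $lag  = $self->lag_behind_master;
    return defined $lag ? 1 : 0;
}

package main;

my $slave = My::Sketch::PgLag->new(lag_source => sub { 3 });
printf "replicating=%d lag=%d\n", $slave->is_replicating, $slave->lag_behind_master;
```

Keeping the lag lookup behind one coderef means you could later swap a direct query for a cached value without touching the two methods.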

>
>(Just to be clear -- the lag measurement is to determine if a slave should be pulled out of the balancer pool, correct?  My need to force all selects to the master for some period of time after a write to the master is a separate issue.)
>

Yes, we check the lag on each slave at a regular interval (set by "auto_validate_every") to see whether it's too laggy, or whether it has recovered and can be put back into the pool.
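
Roughly, the per-slave decision looks like the sketch below.  This is a simplified illustration, not the module's actual code; the function name is made up, and its arguments stand in for what the storage driver's is_replicating and lag_behind_master would return plus the pool's maximum lag setting.

```perl
use strict;
use warnings;

# Decide whether a slave should be in the balancer pool (1) or not (0).
sub replicant_is_usable {
    my ($is_replicating, $lag, $maximum_lag) = @_;

    # The driver cannot report replication status: assume the pool is
    # being managed by hand and leave the slave in.
    return 1 unless defined $is_replicating;

    # Replication has stopped: pull the slave out.
    return 0 unless $is_replicating;

    # Replicating, but the lag is unknown: leave the slave in.
    return 1 unless defined $lag;

    # Otherwise keep the slave only while it is within the allowed lag.
    return $lag <= $maximum_lag ? 1 : 0;
}
```

Because the check runs again on every validation pass, a slave that was dropped for lagging gets back in once its lag falls under the limit.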

>
>I might want to investigate using other means to report the slave lag so that not every process on every web server is querying this info directly from each slave -- e.g. maintain a lag list in memcached.
>

Originally I intended to do this, to reduce the number of lag checks, but I found the overhead was pretty low and in the end I didn't have time to do it.

>
>We have also had some talk about using slave load when selecting a slave.  And another idea we kicked around was to list only one slave and use Skype's pgbouncer for connection pooling and selection.
>

You could probably write a custom Balancer that would do that; just extend DBIx::Class::Storage::DBI::Replicated::Balancer.

>
> 
> Anything wrapped inside a transaction automatically does both reads and writes to the master.  I didn't think about a strategy where after an insert access to slaves would be temporarily suspended.  Seems a bit heavy handed to me, but I guess that could be grafted in.
>>
>
>
>In some cases it's possible that a (web) request may include multiple transactions (via txn_do) followed by a select or two that needs the updated data, so I don't think we can expect the slaves to be synced that fast.  Yes, it's a bit heavy handed to force all selects to the master after a write.  Much of the app is accessed via an API, so it's quite possible that a read will happen right after a write and expect to fetch current data.  So, I'm not sure there's much option other than waiting until the slaves have synced.
>
>
>I guess I probably should figure out which queries can always go to the slaves and which selects need to read from the master after any writes, but that would be a bit of work for an existing app.  So, it's easier to just force them all.
>

I understand, you really need to write the app with replication in mind from the start.  That's unfortunate but true for now.
>
> 
>>>>The existing (non-DBIC) application will set a flag in memcached when a write happens.  This is keyed by user id. And each request memcached is checked to see if the current user needs to read from the master.  I'm looking at a way to duplicate that behavior with ::Replicated.
>>>>
>>

Take a look at DBIx::Class::Storage::DBI::Replicated::Pool and see if you can't just subclass that or something.  That's the class that decides what's in the pool or not, and you could probably graft any custom logic into that.

>>>
>>
>>Well, you can flip replication off on a per query basis:
>>
>>>>my $RS = $schema->resultset('Source')->search(undef, {force_pool=>'master'});
>>
>>>>I'd probably hack into my Catalyst model to add that bit to search resultset automatically.
>>
>
>
>By overriding (or "after resultset") in the ResultSet base class? 
>
>
>If I want to do that for every query over the entire life of a request, is there any reason why calling $storage->set_reliable_storage would not accomplish the same thing?  (Oh, you comment on that below..)
>
>
>
>
>
>>Seems there's a bit more here than you need.  Like I mentioned above you got that force_pool=>'master' thing.  Honestly I'd probably start by adding that force_pool stuff to my controllers and make sure it works and then back it into the model.  I'm not so excited by the "set_reliable_storage" and its counterpart since the way it flips state is a bit shaky in my mind.
>
>
>Yes, that was one of my first questions when looking at the docs.  There's an example of cloning the schema and setting it.  I was wondering why there was a need to clone if I explicitly force it with set_reliable_storage or set_balanced_storage.

There's a good reason; I just can't recall it at the moment.  I think if you don't clone it you lose the setting or something.  Oh, maybe it's that if you don't clone, all the requests coming to the webserver share a single schema object and they start flipping the setting on each other.  When you clone, each request has its own schema that won't interfere with anyone else's.

Don't worry about clone; DBIC was designed to clone and it's very fast.

>
>
> 
> That was my first attempt at giving the user this kind of control and I left it in for backcompat.  You should also take a look at the tests for replication (in the t directory) and make heavy use of DBIC_TRACE=1 to see what is going on.  I override this output so you can see which slaves (or the master) are picking up which bit.
>>
>
>
>Oh ya, I was concerned about that.  I spent a bit of time checking the behavior with DBIC_TRACE and I do see the intended behavior.  My concern was that with a mix of transactions the code might be confused about what state it was in (reliable or balanced).  But, the behavior seems to work as expected in my tests.
>

Yeah I always get shocked when I see the whole thing sorta working :)

>
>But, I could rewrite to check a flag added to the schema and add force_pool => 'master' to every resultset.
>

You could do that or as I mentioned maybe you can write a custom pool class that would poll the memcache thing and turn off replicants for you.
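
For the first option, something along these lines might be a starting point.  It is only a sketch: the %write_flag hash stands in for memcached, and the 30 second window and all three function names are assumptions for illustration.  The only DBIC piece is the force_pool => 'master' search attribute.

```perl
use strict;
use warnings;

my %write_flag;          # stand-in for memcached: user id => expiry epoch
my $PIN_SECONDS = 30;    # assumed window; tune it to your replication lag

# Call after any write made on behalf of $user_id.
sub note_write {
    my ($user_id, $now) = @_;
    $now = time unless defined $now;
    $write_flag{$user_id} = $now + $PIN_SECONDS;
    return;
}

# True while the user's most recent write is still inside the window.
sub pinned_to_master {
    my ($user_id, $now) = @_;
    $now = time unless defined $now;
    my $expires = $write_flag{$user_id};
    return 0 unless defined $expires;
    return $now < $expires ? 1 : 0;
}

# Return a copy of the search attributes, forcing the master when pinned.
sub search_attrs_for {
    my ($user_id, $attrs, $now) = @_;
    my %merged = %{ $attrs || {} };
    $merged{force_pool} = 'master' if pinned_to_master($user_id, $now);
    return \%merged;
}

# With a real schema the call would look something like:
#   $schema->resultset('Source')->search($cond, search_attrs_for($uid, \%attrs));
```

The optional $now argument is only there so the timing can be tested without sleeping.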
>
>
>
>Thanks very much for the comments.  
>
>
> 
>-- 
>Bill Moseley
>moseley at hank.org
>