[Dbix-class] DBIx::Class::Storage::DBI::Replicated - read from master

Thu Jun 10 04:30:48 GMT 2010

Hi John,

On Wed, Jun 9, 2010 at 1:15 PM, John Napiorkowski <jjn1056 at yahoo.com> wrote:

>
> Interesting idea.  The way DBIC Replication works is that by default when a
> slave falls too far behind it gets dropped from the pool until it catches
> up.

IIUC, Slony does have a way to report slave lag, but I discussed it briefly
with our DBA and for the existing (non-DBIC dbi-based) master/slave code it
was decided to just use a hard-coded per-user delay.  A user writes to the
db and is pegged to master for some period of time.  There's a separate
process that removes slaves if the lag gets too big.
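A minimal sketch of the per-user delay described above, in case it helps the discussion. Everything here is an assumption for illustration: the %master_peg hash stands in for memcached, and the names peg_user_to_master, user_needs_master and the 30-second $PEG_SECONDS are hypothetical, not part of DBIC or the existing application.

```perl
use strict;
use warnings;

# Hypothetical stand-in for memcached: user id => epoch time when the peg expires.
my %master_peg;

# Assumed fixed delay; a real value would depend on the slave lag you observe.
my $PEG_SECONDS = 30;

# Record that this user just wrote, so their reads should go to the master.
sub peg_user_to_master {
    my ($user_id, $now) = @_;
    $now = time unless defined $now;
    $master_peg{$user_id} = $now + $PEG_SECONDS;
    return $master_peg{$user_id};
}

# True while the user's peg has not yet expired.
sub user_needs_master {
    my ($user_id, $now) = @_;
    $now = time unless defined $now;
    my $expires = $master_peg{$user_id};
    return 0 unless defined $expires;
    if ($now >= $expires) {
        delete $master_peg{$user_id};    # expired: reads may go back to the slaves
        return 0;
    }
    return 1;
}
```

The optional $now argument is only there so the expiry logic can be exercised without sleeping; with a real memcached the expiry would presumably be handled by the key's TTL instead.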

I would be interested in trying to add lag support, though.
Since DBIx::Class::Storage::DBI::Pg doesn't have methods to report the lag,
I suppose I'd need to create a subclass.  How do I
subclass DBIx::Class::Storage::DBI::Pg?  Is there another way than using a
new dsn (e.g. dbi:PgReplicated) to tell DBIC which DBIC driver to use?

(Just to be clear -- the lag measurement is to determine if a slave should
be pulled out of the balancer pool, correct?  My need to force all selects
to the master for some period of time after a write to the master is a
separate issue.)

I might want to investigate using other means to report the slave lag so
that not every process on every web server is querying this info directly
from each slave -- e.g. maintain a lag list in memcached.

We have also had some talk about using slave load when selecting a slave.
 And another idea we kicked around was to list only one slave and use
Skype's pgbouncer for connection pooling and selection.

>  Anything wrapped inside a transaction automatically does both reads and
> writes to the master.  I didn't think about a strategy where after an insert
> access to slaves would be temporarily suspended.  Seems a bit heavy handed
> to me, but I guess that could be grafted in.
>

In some cases it's possible that a (web) request may include multiple
transactions (via txn_do)  followed by a select or two that needs the
updated data so I don't think we can expect the slaves to be synced that
fast.  Yes, it's a bit heavy handed to force all selects to the master after
a write.  Much of the app is accessed via an API so it's quite possible
that a read will happen right after a write and expect to fetch current
data.  So, I'm not sure there's much option other than waiting until the
slaves have synced.

I guess I should probably figure out which queries can always go to the slaves
and which selects need to read from the master after any writes, but that
would be a bit of work for an existing app.  So, it's easier to just force
all.

> >The existing (non-DBIC) application will set a flag in memcached when a
> write happens.  This is keyed by user id. And each request memcached is
> checked to see if the current user needs to read from the master.  I'm
> looking at a way to duplicate that behavior with ::Replicated.
> >
>
> Well, you can flip replication off on a per query basis:
>
> my $RS = $schema->resultset('Source')->search(undef,
> {force_pool=>'master'});
>
> I'd probably hack into my Catalyst model to add that bit to search
> resultset automatically.
>

By overriding (or "after resultset") in the ResultSet base class?

If I want to do that for every query over the entire life of a request, is
there any reason why calling $storage->set_reliable_storage would not
accomplish the same thing?  (Oh, you comment on that below..)

> Seems there's a bit more here than you need.  Like I mentioned above you
> got that force_pool=>'master' thing.  Honestly I'd probably start by adding
> that force_pool stuff to my controllers and make sure it works and then back
> it into the model.  I'm not so excited by the "set_reliable_storage" and its
> counterpart since the way it flips state is a bit shaky in my mind.

Yes, that was one of my first questions when looking at the docs.  There's
an example of cloning the schema and setting it.  I was wondering why there
was a need to clone if I explicitly force it with set_reliable_storage or
set_balanced_storage.

>  That was my first attempt at giving the user this kind of control and I
> left it in for backcompat.  You should also take a look at the test for
> replication (in the t directory) and make heavy use of DBIC_TRACE=1 to see
> what is going on.  I override this output so you can see what slaves (or the
> master) are picking up which bit.
>

Oh ya, I was concerned about that.  I spent a bit of time checking the
behavior with DBIC_TRACE and I do see the intended behavior.  My concern was
with a mix of transactions the code might be confused about what state it
was in (reliable or balanced).  But, the behavior seems to work as expected
in my tests.

But, I could rewrite to check a flag added to the schema and add
force_pool => 'master' to every resultset.
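Roughly what I have in mind, as a sketch only. The helper name attrs_for_request and the $force_master flag are hypothetical; the only DBIC piece assumed is the force_pool search attribute John mentioned above.

```perl
use strict;
use warnings;

# Merge force_pool => 'master' into the search attributes when the
# (hypothetical) per-request flag says reads must hit the master.
# Returns a new hashref and leaves the caller's attributes untouched.
sub attrs_for_request {
    my ($force_master, $attrs) = @_;
    my %merged = %{ $attrs || {} };
    $merged{force_pool} = 'master' if $force_master;
    return \%merged;
}

# Usage, assuming $schema is connected with ::Replicated storage and
# $force_master was set earlier in the request (e.g. from the memcached flag):
#
#   my $rs = $schema->resultset('Source')
#                   ->search(undef, attrs_for_request($force_master, { rows => 10 }));
```

Keeping this as a plain function means the same check could be called from a ResultSet base class or from the Catalyst model, whichever turns out to be the better place to hook it in.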

Thanks very much for the comments.

-- 
Bill Moseley
moseley at hank.org