[Dbix-class] REPOST: memcache / DBIx::Class

Tue May 29 22:06:31 GMT 2007

On 5/29/07, Matt S Trout <dbix-class at trout.me.uk> wrote:
>
> I'm starting to think Oleg and I were possibly -both- right :)
>
> I think we've got more than one use case here -
>
> (1) caching of specific queries
>
> (2) general resultset query caching
>
> (3) specific find-by-identity (PK/UK/etc.) caching
>
> I'm fairly sure (3) probably wants to happen at the resultsource level
> so it's shared across all resultsets (or via storage dealing with the $source
> it's passed, if we moved find() to using select_single ... but I think I like
> the source doing it so it can re-cache by any other identities when it happens).
>

I think a lot of the ideas/posts on this subject are ignoring cache
invalidation.  If we cache a result or the realization of a resultset,
it would be highly inconsistent of us to continue serving stale cached
data after someone issued a DBIC statement that would have altered
that information.

Proper cache invalidation is the real thorny thing that makes all of
the generic approaches difficult to impossible.  I think (3) is the
lowest-hanging fruit.

If we have a real PK/UNIQUE column to key on, we can probably reliably
invalidate the cache on ins/upd/del of that key, which makes for a
consistent cache model within that limited usage.  We could probably
do special cases for "SELECT * FROM foo" too, and just invalidate on
any ins/upd/del on anything in the table.  This is a very simplistic
but useful caching model, where the PK/UK value is the key, and the
object's data is the data.  You could probably even extend it to cache
objects returned by a complex search by their PKs as they are fetched
via the ->next iterator too.
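To make the key-based model concrete, here is a minimal sketch in Python (used only for illustration, this is not DBIx::Class code). The IdentityCache class, its method names, and the dict standing in for the table are all hypothetical; the only point is that every insert/update/delete of a key evicts that key.

```python
class IdentityCache:
    """Hypothetical cache keyed by primary key; a dict stands in for the table."""

    def __init__(self, backing_rows):
        self.rows = backing_rows   # fake table: {pk: row dict}
        self.cache = {}            # pk -> cached copy of the row
        self.db_reads = 0          # counts lookups that missed the cache

    def find(self, pk):
        if pk in self.cache:
            return self.cache[pk]
        self.db_reads += 1
        row = self.rows.get(pk)
        if row is not None:
            self.cache[pk] = dict(row)
        return self.cache.get(pk)  # None when the row does not exist

    def insert(self, pk, row):
        self.rows[pk] = dict(row)
        self.cache.pop(pk, None)   # invalidate on insert of that key

    def update(self, pk, **changes):
        self.rows[pk].update(changes)
        self.cache.pop(pk, None)   # invalidate on update of that key

    def delete(self, pk):
        self.rows.pop(pk, None)
        self.cache.pop(pk, None)   # invalidate on delete of that key
```

Because every write names its key, the next find() for that key always goes back to the table, so the cache never serves a row that a keyed write has changed.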

An even bigger issue with key-based invalidation is that there probably
isn't a general solution that also covers update/delete with complex
where clauses, as opposed to update/delete of a specific key. (As in,
"UPDATE people SET account_type=30 WHERE person_id > 200", or "DELETE
FROM people WHERE person_id > 200").
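The failure mode can be shown with a small sketch (again Python for illustration only; rows, cache, find and bulk_update are hypothetical stand-ins, not DBIC API). The bulk update never names the keys it touches, so a cache that only invalidates by key has nothing to evict.

```python
# Fake "people" table keyed by person_id, plus a naive per-key cache.
rows = {100: {"account_type": 10}, 250: {"account_type": 10}}
cache = {}

def find(pk):
    """Return the cached row, loading it from the fake table on a miss."""
    if pk not in cache:
        cache[pk] = dict(rows[pk])
    return cache[pk]

def bulk_update(predicate, **changes):
    """Stand-in for UPDATE ... WHERE <predicate>; no keys are named by the caller."""
    for pk, row in rows.items():
        if predicate(pk, row):
            row.update(changes)

find(250)                                             # warms the cache for key 250
bulk_update(lambda pk, row: pk > 200, account_type=30)

stale = find(250)["account_type"]   # still 10: served from the cache
fresh = rows[250]["account_type"]   # 30: what the table now holds
```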

The easiest approach is, of course, to cache aggressively (even
->search() based on serialized search args), and blow the whole cache
on any ins/upd/del of anything in the table.  This solves all
consistency issues, but is far less than ideal efficiency-wise.
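A rough sketch of that blunt approach, in Python for illustration: results are cached under the serialized search arguments, and any write clears the whole table's cache. TableQueryCache and its methods are hypothetical names, and json.dumps with sort_keys=True is just one way to get a stable cache key.

```python
import json

class TableQueryCache:
    """Hypothetical per-table cache keyed by serialized search arguments."""

    def __init__(self, rows):
        self.rows = rows      # fake table: {pk: row dict}
        self.cache = {}       # serialized args -> list of matching rows
        self.db_reads = 0     # counts searches that missed the cache

    def search(self, **conditions):
        key = json.dumps(conditions, sort_keys=True)
        if key not in self.cache:
            self.db_reads += 1
            self.cache[key] = [
                dict(row) for row in self.rows.values()
                if all(row.get(col) == val for col, val in conditions.items())
            ]
        return self.cache[key]

    def write(self, pk, row):
        """Insert or update a row; any write flushes every cached search."""
        self.rows[pk] = dict(row)
        self.cache.clear()

    def delete(self, pk):
        self.rows.pop(pk, None)
        self.cache.clear()
```

The cost is visible in write() and delete(): a change to one row throws away cached results for searches that row could never have matched.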

For the other options, (1) and (2), cache invalidation is even harder
to solve.  One (probably unrealistic) solution would be to actually
understand WHERE clauses in Perl, so that we can use them to grep
through cache entries and evict the ones a statement would affect.
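For a sense of what "understanding WHERE clauses" would involve, here is a toy sketch in Python (illustration only). It assumes a made-up condition format, {column: (op, value)} with all conditions ANDed, and handles only simple comparisons; real SQL (OR, NULL semantics, joins, functions) is well beyond it, which is why the idea is probably unrealistic in general.

```python
import operator

# Supported comparison operators for the toy condition format.
OPS = {"=": operator.eq, "!=": operator.ne,
       ">": operator.gt, ">=": operator.ge,
       "<": operator.lt, "<=": operator.le}

def matches(row, where):
    """True if the row satisfies every {column: (op, value)} condition."""
    return all(OPS[op](row[col], val) for col, (op, val) in where.items())

def invalidate_matching(cache, where):
    """Evict cached rows the WHERE clause would touch; return the evicted keys."""
    doomed = [pk for pk, row in cache.items() if matches(row, where)]
    for pk in doomed:
        del cache[pk]
    return doomed

cache = {150: {"person_id": 150, "account_type": 10},
         250: {"person_id": 250, "account_type": 10}}

# Equivalent of "... WHERE person_id > 200" applied to the cache.
evicted = invalidate_matching(cache, {"person_id": (">", 200)})
```

This only covers per-row entries. Cached search results are harder still, since an update can make a row start matching a search it did not match before.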

-- Brandon