[Dbix-class] Static / reference data

Sam perl at net153.net
Wed Aug 22 18:39:29 GMT 2018


On 08/22/2018 01:14 PM, Andy Armstrong wrote:
> Hello,
>
> I've got a database of TV and Radio programmes [1] which currently uses DBI and ad-hoc queries. I'm considering migrating it to use DBIC.
>
> The two most important tables are 'programmes' - which contains about 5.5 million TV and Radio programme listings and 'services' which lists 90 different channels on which those programmes have been broadcast over time. There are lots of other tables but those two are hit by all the hot paths through the application.
>
> The programmes table is large enough that there's no point in trying to cache it at query level - generally speaking we're hitting it completely randomly and if, say, a particular programme gets a lot of hits that's taken care of by CDN caching.
>
> The services table is small, almost completely static (i.e. I don't mind a manual restart if it changes) and, in the current implementation, cached entirely in memory.
>
> Anywhere we display a programme we also need to have its service.
>
> programmes -> belongs_to -> services
> services -> has_many -> programmes
>
> Most of the pages listing programmes (search, schedule) can be generated with a single query to programmes. However for each programme I have to look up its service. That's currently very quick because the services are cached in memory.
>
> With DBIC every call to $prog->service->name results in a query against services - so for a page of 20 search results that's 20 additional queries. Even on a single page it's often the case that many of those queries are searching for the same service.
>
> I've played around with DBIx::Class::Cursor::Cached but that doesn't seem to cache belongs_to lookups at all. Here's what I tried:
>
>    my $cache = Cache::FileCache->new( { namespace => 'GenomeCache' } );
>
>    schema->default_resultset_attributes(
>      { cache_object => $cache,
>        cache_for    => 3600
>      }
>    );
>
> It seems to cache my top level search against programmes but not the subsequent queries for $prog->service->name.
>
> What's the state of the art for dealing with such static data? Ideally I'd like an interface that amounts to a switch that I can throw on a per-table basis to say "this is reference data". Obviously I can just implement my own $prog->service that finds the service in a static cache but that feels messy compared to having a DBIC based implementation of services that understands that it's completely in-core cacheable.
>
> Thanks!
>
> [1] https://genome.ch.bbc.co.uk
>

This seems like a simple schema, not sure what caching would buy you 
unless the DB is not a local service on the same box or network (or 
getting several hundred requests per second). Generally, in my apps when 
it comes to more complex queries, I first write the query in raw SQL and 
benchmark it. Then I convert it over to a DBIC resultset (which normally 
means using the prefetch option) and bench mark that. If it much slower 
than the raw SQL query, I might look into only fetching the columns I 
need (via the 'columns => []' property). If that doesn't speed it up 
enough, I'll then switch to using the hashref inflator resultsource 
class which usually brings it right in line raw SQL performance.


--Sam




More information about the DBIx-Class mailing list