[Dbix-class] don't want to hit the db again

Wed Apr 29 11:56:59 GMT 2015

On Tue, Apr 28, 2015 at 03:37:42PM -0700, Kevin Karabian wrote:

> Yes, but the problem is for a has_many, prefetch causes multiple rows to be
> returned.  For example if a result object has 10 related objects then 10
> rows are returned for that one object.  If the result object is large, then
> that is a lot of repetition.

We've come across a similar problem at work too. We had a situation
where there were a few very large rows which each had very many small
related rows. We were carefully using prefetch to make just 1 query:

  -- meadows are big but few, cows are small but many
  SELECT meadows.*, cows.*
    FROM meadows, cows
   WHERE cows.meadow_id = meadows.id;

but that ended up returning so much data (almost all of it repeated)
that reading it from the database, transferring it across the network,
and then unpacking it into perl structures took far too long. In the end
we made the code faster by not pre-fetching and letting DBIx::Class
follow the relationship (and issue a query) whenever we wanted
related rows:

  SELECT * FROM meadows;
  foreach (@meadows) {
    SELECT * FROM cows WHERE meadow_id = $_->id;
  }

That makes $#meadows + 2 queries instead of 1, but returns so much less
data that it's quicker anyway. We could make the code faster and have
fewer queries (although not just one) if we could do something like
this:

  # few rows, but each row is really really big
  SELECT * FROM meadows
  # each row is small but there are a lot of them
  @cows    = SELECT * FROM cows WHERE meadow_id in (map { $_->id } @meadows)
  iterate over @cows attaching them to appropriate meadow records

That returns exactly the same amount of data as the second version, but
in two queries. It's O(1) instead of O(N). There will be circumstances
where this is faster.

-- 
David Cantrell | Hero of the Information Age

     I'm in retox