[Dbix-class] Re: Bug in prefetch subqueries?

Thu Jul 30 13:22:14 GMT 2009

On Thu, Jul 30, 2009 at 08:17, Peter Rabbitson<rabbit+dbic at rabbit.us> wrote:
> And now we come to your semi-bug which reads:
> $f2b_artist_rs->search ({}, { prefetch => 'cd' })
>                ->first
>                  ->cds
>                    ->count
>
> The reason this returns 1, is because we execute only *one* query
> (the whole point of prefetch => ). The result of the query is a
> single row of artist/cd data, which is used to pre-populate all the
> slots for ->has_many_accessor. In order to have the above return 2
> we would have to re-query the database, effectively nullifying the
> effect of prefetch. Now, if you are hell-bent on making this
> correct and still maintain the prefetch == single query paradigm,
> you would have to alter the prefetch mechanism to detect if there
> are *any* right-side join conditions, and if this is the case do
> something like:

That's my point. The result of the query should be -two- rows of artist/cd data.

> SELECT artist.*, cd2.*
>  FROM artist artist
>  LEFT JOIN cd cd ON artist.id = cd.artistid
>  LEFT JOIN cd cd2 ON artist.id = cd2.artistid
> WHERE cd.name = ?

Or this:
SELECT artist.*, cd.*
FROM (
SELECT artist.*
FROM artist artist
LEFT JOIN cd cd ON artist.id = cd.artistid
WHERE cd.name = ?
) artist
LEFT JOIN cd cd ON artist.id = cd.artistid

> Effectively we pull an additional "shadow" join that will be
> related to the artist rs limited by the initial join, and which
> will be used for the actual object inflation. Of course in order to
> make this happen we need both introspectable SQLA, and sane multi-
> has-many collapse (actually the collapse may not be ncessary, as we
> do not select anything from the first join). And even if we did, I
> am still not sure if going to such great lengths is in fact
> justified. Prefetched rs == non-prefetched rs is a flawed concept
> imo, as it is not quite compatible with prefetch == one query only,
> which as far as I am concerned *is* the selling point of prefetch
> (db-trip latencies are the most significant bottleneck I have
> encountered during my work).

We don't need introspectable SQLA and multi-has-many collapse is an
orthogonal issue. We just need to stop conflating the prefetched
results with the main resultset.

And I didn't say "Prefetched rs == non-prefetched rs". I said that the
rows from a search should behave identically regardless of whether:
    * the rows in question came from a prefetched search or not
    * the relation being accessed was prefetched or not

And the current implementation of prefetch violates that. Consider the
following:

----------------
sub print_cds {
    my $artist = shift;

    return sprintf "%s", join( ", ", sort map { $_->name } $artist->cds ), $/;
}

my $f2b_rs = Artist->search({
    'cd.name' => 'Fade to Black'
});

my $cds = print_cds( $f2b_rs->first );

my $f2b_pref_rs = $f2b_rs->search( {}, { prefetch => 'cds' } );

my $pref_cds = print_cds( $f2b_pref_rs->first );

is( $cds, $pref_cds );
----------------

That test will currently fail and it shouldn't. Prefetch is purely an
optimization. Optimizations should never change behavior.

Rob