[Dbix-class] (no subject)

Eden Cardim edencardim at gmail.com
Mon Jul 9 00:20:38 GMT 2012


>>>>> "Len" == Len Jaffe <lenjaffe at jaffesystems.com> writes:

    Len> How much caching is DBIC doing under its own hood?
    Len> If we execute the same complex query several times, is any attempt
    Len> made to save the internal state of how DBIC arrived at the SQL from
    Len> the DBIC/SQLA method calls?

Yes, resultsets recycle the statement handle ($sth) via the cursor when
possible.

    Len> Likewise on the back end?

    Len> Just throwing ideas out, but along the same vein as Moose's
    Len> make_immutable, if we could signal to DBIC that a query/resultset is
    Len> not going to change but for its bind values, we'd be able to get the
    Len> developers' advantage of the ORM, and the runtime advantage of
    Len> reusing prepared statements.

    Len> Or does that already exist, and I require some book learning?

It already does that: DBIC's resultset API is immutable by design, for
exactly that reason. Every time you call ->search you get a brand-new
cloned resultset object with the new attributes applied, and the SQL
generation for that specific resultset object is deferred until actual
data is needed by one of the calls. That is not why it's slow, though;
most of the CPU performance hit happens at inflation.
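
To illustrate (the schema and resultset names here are just made up for
the example), chained searches only clone the resultset and stack
attributes; no SQL is prepared or executed until you actually ask for
data:

    my $rs  = $schema->resultset('Artist');                # no query yet
    my $rs2 = $rs->search({ name => { -like => 'A%' } });  # still no query
    my $rs3 = $rs2->search(undef, { rows => 10 });         # still no query

    my @artists = $rs3->all;   # SQL is generated and executed only here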

    Len> Also, I infer from the docs (as I think Laurent may have) that there
    Len> is significant overhead in instantiating row objects, and that
    Len> hashref inflator would provide *significant* (emphasis mine) speed
    Len> improvements over objects.

Yeah, collapsing the data into the objects' relationship hierarchy is
slow because there is a lot to keep track of: column inflation, column
dirtiness state, and any extra columns that might have been added by
arbitrary queries. Even so, it's cheaper to inflate via HashRefInflator
than to inflate into the standard DBIC row object *and* then build the
same hashref by manually calling the row accessors. For maximum
performance in that case, I would suggest flattening out the hierarchy
by building a virtual view and combining it with HashRefInflator for
that specific slow query.
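
Roughly, that combination could look like the sketch below (the class,
table, and column names are invented for the example): the virtual view
wraps the flattened SQL, and HashRefInflator skips building row objects
entirely.

    package MyApp::Schema::Result::OrderSummary;
    use base 'DBIx::Class::Core';

    __PACKAGE__->table_class('DBIx::Class::ResultSource::View');
    __PACKAGE__->table('order_summary');
    __PACKAGE__->result_source_instance->is_virtual(1);
    __PACKAGE__->result_source_instance->view_definition(
        'SELECT o.id, o.status, i.sku
           FROM orders o JOIN order_items i ON i.order_id = o.id'
    );
    __PACKAGE__->add_columns(qw( id status sku ));

    # ...then, at query time, inflate to plain hashrefs:
    my $rs = $schema->resultset('OrderSummary');
    $rs->result_class('DBIx::Class::ResultClass::HashRefInflator');
    my @rows = $rs->all;   # plain hashrefs, no row objects built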

This would make the code in Laurent's benchmark *much* faster. But
again, using DBIC for the scenario in those benchmarks is silly: you
convert the cartesian product (you know, all those rows with repeated
values from the joined table) coming from the db into a structured
hierarchy (collapsing, which eliminates the dupes and builds a data
structure), and then convert that right back into a cartesian product.
The benchmark just prints the exact same values it could have gotten
from fetchrow_arrayref in the first place, so what's the point? If that
is the only use case in your software, then by all means do *not* use
DBIC. Most useful applications, however, need much more structure than
a simple join provides.
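
In other words, for a purely flat report the plain DBI loop below is
all you need; DBIC only starts paying for itself when you want the
collapsed, nested structure in the second half (table and relationship
names here are hypothetical):

    # Flat output straight from DBI: one line per joined row, dupes and all.
    my $sth = $dbh->prepare(
        'SELECT a.name, c.title FROM artist a JOIN cd c ON c.artist_id = a.id'
    );
    $sth->execute;
    while (my $row = $sth->fetchrow_arrayref) {
        print join("\t", @$row), "\n";
    }

    # DBIC with prefetch: the same join, collapsed into one object per artist.
    my @artists = $schema->resultset('Artist')->search(
        undef, { prefetch => 'cds' },
    )->all;
    for my $artist (@artists) {
        my @cds = $artist->cds;   # already in memory, no extra query
        printf "%s has %d CDs\n", $artist->name, scalar @cds;
    }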

-- 
Eden Cardim
+55 11 9644 8225
Shadowcat Systems Ltd.


