[Dbix-class] some answers about the benchmarks

Mon Jul 9 01:29:22 GMT 2012

>>>>> "Dami" == Dami Laurent (PJ) <laurent.dami at justice.ge.ch> writes:

    Dami> - To me, the main lesson is : "beware of preconceived
    Dami> beliefs". From the doc or from DBIC tutorials, many people
    Dami> believe to make their apps faster by using HashRefInflator
    Dami> and/or prefetch; however the benchmarks show that this is not
    Dami> true, esp. with prefetch which apparently could make your app
    Dami> much slower.

These are not preconceived beliefs:

prefetch is recommended as an improvement for the cases where DBIC's
engine would otherwise issue one query for each row in the result set
when accessing data from a related table. It does that through no fault
of its own, but merely because it's computationally impossible and
inefficient to guess that a join needed to be performed at the database
level (as David Cantrell has mentioned in a previous reply). prefetch is
*not* recommended as a CPU optimization. prefetch does, however,
immensely improve database performance in this case, and your own
benchmarks demonstrate this.
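
For anyone following along, here is a minimal sketch of the difference.
It assumes a made-up artist/cd schema on an in-memory SQLite database
(the class, table and column names are hypothetical and have nothing to
do with the benchmark code), and it needs DBIx::Class and DBD::SQLite
installed:

```perl
use strict;
use warnings;

# Hypothetical schema for illustration only, declared inline in one file.
package My::Schema::Result::CD;
use base 'DBIx::Class::Core';
__PACKAGE__->table('cd');
__PACKAGE__->add_columns(qw/id artist_id title/);
__PACKAGE__->set_primary_key('id');

package My::Schema::Result::Artist;
use base 'DBIx::Class::Core';
__PACKAGE__->table('artist');
__PACKAGE__->add_columns(qw/id name/);
__PACKAGE__->set_primary_key('id');
__PACKAGE__->has_many(cds => 'My::Schema::Result::CD', 'artist_id');

package My::Schema;
use base 'DBIx::Class::Schema';
__PACKAGE__->register_class(CD     => 'My::Schema::Result::CD');
__PACKAGE__->register_class(Artist => 'My::Schema::Result::Artist');

package main;

my $schema = My::Schema->connect('dbi:SQLite:dbname=:memory:');
my $dbh    = $schema->storage->dbh;
$dbh->do('CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)');
$dbh->do('CREATE TABLE cd (id INTEGER PRIMARY KEY, artist_id INTEGER, title TEXT)');
$dbh->do('INSERT INTO artist (id, name) VALUES (?, ?)', undef, @$_)
    for [1, 'A'], [2, 'B'];
$dbh->do('INSERT INTO cd (id, artist_id, title) VALUES (?, ?, ?)', undef, @$_)
    for [1, 1, 'a1'], [2, 1, 'a2'], [3, 2, 'b1'];

# Collect "artist: title" lines from whatever resultset we are given.
sub titles {
    my ($rs) = @_;
    my @lines;
    for my $artist ($rs->all) {
        # Without prefetch, this call issues one SELECT per artist.
        # With prefetch, the related rows are already attached.
        push @lines, map { $artist->name . ': ' . $_->title } $artist->cds->all;
    }
    return sort @lines;
}

my $artists = $schema->resultset('Artist');

# One query for the artists, plus one per artist for its cds.
my @lazy = titles($artists);

# A single joined query that fetches artists and cds together.
my @eager = titles($artists->search(undef, { prefetch => 'cds' }));

print "$_\n" for @eager;
```

Both calls return the same data; what changes is the number of round
trips to the database, which is the point of prefetch.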

HashRefInflator is recommended as an improvement for when people wish to
bypass the overhead involved in the object inflation process which is
indeed slower because of column inflation and other ORM features. Your
own benchmark demonstrates this by running prefetch, which inflates data
into objects N^3 times and is therefore much slower than
HashRefInflator.
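
As a minimal sketch of what that looks like in practice, again on a
hypothetical one-table schema that I made up for illustration (not the
benchmark's), the same resultset can hand back either row objects or
plain hashrefs depending on its result_class:

```perl
use strict;
use warnings;

# Hypothetical single-table schema, for illustration only.
package My::Schema::Result::CD;
use base 'DBIx::Class::Core';
__PACKAGE__->table('cd');
__PACKAGE__->add_columns(qw/id title/);
__PACKAGE__->set_primary_key('id');

package My::Schema;
use base 'DBIx::Class::Schema';
__PACKAGE__->register_class(CD => 'My::Schema::Result::CD');

package main;
use Scalar::Util 'blessed';

my $schema = My::Schema->connect('dbi:SQLite:dbname=:memory:');
my $dbh    = $schema->storage->dbh;
$dbh->do('CREATE TABLE cd (id INTEGER PRIMARY KEY, title TEXT)');
$dbh->do('INSERT INTO cd (id, title) VALUES (?, ?)', undef, @$_)
    for [1, 'a1'], [2, 'a2'];

my $rs = $schema->resultset('CD')->search(undef, { order_by => 'id' });

# Default: every row is inflated into a My::Schema::Result::CD object.
my @objects = $rs->all;

# Same query, but each row comes back as an unblessed hashref, which
# skips the object construction step.
my $plain = $rs->search(undef, {
    result_class => 'DBIx::Class::ResultClass::HashRefInflator',
});
my @hashes = $plain->all;

printf "%s via accessor, %s via hash key\n",
    $objects[0]->title, $hashes[0]{title};
```

The trade-off is that the hashrefs have no accessors, relationships or
column inflation, so this only fits read-only code paths.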

Also, the prefetch test exploded into 600 MB of memory on my machine,
probably because prefetch loads large numbers of rows into memory,
whereas non-prefetch runs a query for each row and throws away the
results after they're printed. I've modified the benchmark: limiting the
number of rows to 10000 makes prefetch run faster than plain. My bet is
that the memory management overhead might be what's slowing things down
here. Then again, inflating 100k^3 objects into memory isn't a realistic
scenario. Fix your benchmarks to a scenario that's fair for both
libraries before you start spreading FUD.

    Dami> - Also beware of preconceived beliefs when stating something
    Dami> like.  "DBIDM is a bit faster yes, but it's also less flexible
    Dami> and robust".

Again, not preconceived. Having contributed to the design and
implementation of DBIC, looked at DBIDM's code and design decisions,
and experimented with it in a project, I guarantee you, there is a *lot*
lacking on DBIDM's end design-wise. It *is* very fast and speed-focused,
but it fails to abstract the storage backend the way DBIC does, which
makes it useful for no more than a handful of very specific
scenarios. One could only expect that from an optimized tool, given that
speed is inversely proportional to generality. DBIC's design goal is to
establish an immutable/curried/lazy API backed by a reasonably
not-so-slow engine, which you can later optimize at specific bottlenecks
of your system. It indeed does not perform well by default for the cases
where DBI's fetchrow_arrayref() would suffice (your benchmark), because
for those cases one should just use DBI. That said, the code you
implemented in your benchmark has little or no relevance to DBIC's
target application scope. This is something beginners and casual users
should be aware of, but it's so abstract that it's hard to explain
succinctly in the docs.
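
To make "immutable/curried/lazy" a bit less abstract, here is a rough
sketch on a hypothetical cd table (names and data are made up). Each
->search returns a new resultset and leaves the old one alone, and no
SQL is run until rows are actually asked for, which the sketch shows by
building the query before the table even exists:

```perl
use strict;
use warnings;

# Hypothetical single-table schema, for illustration only.
package My::Schema::Result::CD;
use base 'DBIx::Class::Core';
__PACKAGE__->table('cd');
__PACKAGE__->add_columns(qw/id title year/);
__PACKAGE__->set_primary_key('id');

package My::Schema;
use base 'DBIx::Class::Schema';
__PACKAGE__->register_class(CD => 'My::Schema::Result::CD');

package main;

my $schema = My::Schema->connect('dbi:SQLite:dbname=:memory:');

# Build the query in steps. Nothing is sent to the database here; the
# table has not been created yet.
my $cds    = $schema->resultset('CD');
my $recent = $cds->search({ year => { '>=' => 2000 } });
my $sorted = $recent->search(undef, { order_by => 'title' });

# Only now create and fill the table.
my $dbh = $schema->storage->dbh;
$dbh->do('CREATE TABLE cd (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)');
$dbh->do('INSERT INTO cd (id, title, year) VALUES (?, ?, ?)', undef, @$_)
    for [1, 'old', 1990], [2, 'zeta', 2005], [3, 'alpha', 2010];

# The SELECT runs here, when rows are requested.
my @titles = map { $_->title } $sorted->all;

# $recent was not modified by the later ->search and is still usable.
my $count = $recent->count;

print join(', ', @titles), " ($count recent)\n";
```

That composability is what you pay for in the simple fetch-everything
case, and it is also what lets you swap in a faster code path at one
specific bottleneck later.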

    Dami>   Here is not the place to argue

If not here, where?

    Dami> but to anybody interested, go to the doc
    Dami> (esp. https://metacpan.org/module/DBIx::DataModel::Doc::Design
    Dami> ), do some experiments, and find out by yourself.

Or you can ask someone on the list or in the support channel for help
and they'll point you to ways of achieving speeds similar to the ones
posted in the benchmark code. They might even point you to DBI or DBIDM
or whatever, if that happens to be the best solution for the case.

-- 
Eden Cardim
+55 11 9644 8225
Shadowcat Systems Ltd.