[Dbix-class] (no subject)

Mon Jul 2 13:07:13 GMT 2012

>>>>> "Dami" == Dami Laurent (PJ) <laurent.dami at justice.ge.ch> writes:

    Dami> Hi DBIC list,
    Dami> For info, I gave a talk at the French Perl Workshop 2012
    Dami> about comparing DBIx::Class (DBIC) and DBIx::DataModel (DBIDM);
    Dami> at this occasion I did a few benchmarks that may be worth sharing
    Dami> with you :

    Dami> Extract & print 2 columns from a single table (109349 rows)
    Dami>   - raw DBI                    0.43 secs
    Dami>   - DBIC regular              11.09 secs
    Dami>   - DBIC hashref inflator     10.06 secs    
    Dami>   - DBIC 'raw data' (cursor)   4.48 secs
    Dami>   - DBIDM regular              4.00 secs
    Dami>   - DBIDM fast statement       2.25 secs

    Dami> Join 3 tables & print 4 columns from the join (113895 rows)
    Dami>   - raw DBI                                     1.36 secs
    Dami>   - DBIC regular                               46.70 secs
    Dami>   - DBIC, join & +columns                      15.50 secs
    Dami>   - DBIC, join & +columns, hashref inflator    14.17 secs
    Dami>   - DBIC, join & +columns, 'raw data' (cursor)  6.59 secs
    Dami>   - DBIC, prefetch                            146.29 secs
    Dami>   - DBIDM regular                               5.01 secs
    Dami>   - DBIDM fast statement                        3.28 secs

    Dami> I was not surprised to find out that DBIC is slower
    Dami> than DBIDM :-) ; however, I was quite surprised to 
    Dami> find out that, among DBIC mechanisms :

DBIDM is a bit faster, yes, but it's also less flexible and robust; it's
a well-known trade-off. You're comparing an F1 car with a Jeep on a
paved highway. :)

    Dami> a) 'HashRefInflator', often advocated as being the fast way to get
    Dami>    data from DBIC, actually doesn't seem to bring any significant
    Dami>    benefit.

Indeed, for the use case you're benchmarking, there is no purpose or
advantage to using DBIC because you're just printing out the values in
the same tabular format you're obtaining from the database. The extra
work goes into collapsing the cartesian products into an actual data
structure. Assuming you don't need this structure, there is no case for
using DBIC or any other ORM. You should try the same benchmarks and
print out a tree-like structure.
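
To make the "collapsing" step concrete, here is a minimal,
library-neutral sketch in Python. It is not DBIC's actual implementation;
the artist/album/track rows and the function names are invented for
illustration. It only shows the kind of extra per-row work needed to turn
the flat rows of a 3-table join into a nested structure, which a plain
"print the rows" benchmark never has to do.

```python
# Hypothetical flat rows, shaped like the result of a 3-table join:
# (artist, album, track). Parent values repeat on every row.
ROWS = [
    ("Artist A", "Album 1", "Track 1"),
    ("Artist A", "Album 1", "Track 2"),
    ("Artist A", "Album 2", "Track 3"),
    ("Artist B", "Album 3", "Track 4"),
]


def collapse(rows):
    """Fold the repeated parent values into a nested dict:
    {artist: {album: [track, ...]}}."""
    tree = {}
    for artist, album, track in rows:
        tree.setdefault(artist, {}).setdefault(album, []).append(track)
    return tree


def render_tree(tree):
    """Return the tree as indented lines, one node per line."""
    lines = []
    for artist, albums in tree.items():
        lines.append(artist)
        for album, tracks in albums.items():
            lines.append("  " + album)
            for track in tracks:
                lines.append("    " + track)
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(collapse(ROWS))))
```

With raw DBI you would have to write something like collapse() yourself
as soon as the application needs the nested form.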

    Dami> b) 'prefetch', also advocated for doing speed improvements, indeed 
    Dami>    does its job of sparing queries to the database, but then has  
    Dami>    such a high cost in handling the retrieved data that it becomes
    Dami>    the most expensive method.

The way you're querying and displaying the results is what's responsible
for the cost of that benchmark, given you're building N^3 objects
throughout the entire run, and you're invoking a method for displaying
the columns. You should get a speed improvement by using
$row->get_columns instead.

    Dami> c) 'cursor', which goes directly to the DBI layer and therefore
    Dami>    loses all ORM features for the retrieved data, nevertheless 
    Dami>    adds a significant cost over raw DBI.

Of course it adds cost. If you're using something like DBIC, it's
because you need structure in your fetched data; if you use raw DBI,
you're just delaying the building of a data structure from the fetched
data. Of course, this is useful for dodging a benchmark. But in a real
application, when you start passing that data around throughout the
various layers, you're going to need the structure and you'll be
penalized anyway. DBIC is designed to be a fast kick-start for
developing an application and then optimizing the bottlenecks. For this
case, you can add your raw DBI code in a custom cursor if you need the
speed.

    Dami> Since I'm not an expert of DBIC, I may well have done something wrong
    Dami> in those benchmarks; so please correct me if necessary.
    Dami> The source code is at https://github.com/damil/compare-ORM

    Dami> The FPW12 talk also discussed various design aspects; the slides are at
    Dami> http://www.slideshare.net/ldami/dbixclass-vs-dbixdatamodel.

It might be worth mentioning that slide 14 has a few
erroneous/incomplete statements regarding DBIC schema declaration:

- one file for each class -- there is currently no constraint besides
  the Perl language syntax/semantics on the number of files per class;
  you can declare your result classes in a single file if you wish to do
  so.

- regular perl classes -- DBIC supports dynamic class creation and is
  even capable of introspecting your database schema and deriving all
  relationships from it automatically (except many-to-many, for obvious
  reasons) without any manual intervention besides the configuration of
  the connection with the database. This compensates for the two-way
  relationship declarations in most scenarios. The dynamic generation
  approach is generally not recommended because it adds several
  maintenance complexities which make it unviable for use in
  industrial-scale applications (which generally evolve spontaneously
  from "simple" applications). It might be worth mentioning that DBIC
  also supports a "hybrid" approach, where you generate the classes
  dynamically and then add static modifications to them.

- full column info -- DBIC also supports plain column-name-only
  declaration.

-- 
      Eden Cardim         Need help with your Catalyst or DBIx::Class project?
      Code Monkey                 http://www.shadowcat.co.uk/catalyst/
 Shadowcat Systems Ltd.    Want a managed development or deployment platform?
 http://edencardim.com            http://www.shadowcat.co.uk/servers/