[Dbix-class] Fwd: MySQL query plan affecting DBIC structure (bug?)

Peter Rabbitson rabbit+dbic at rabbit.us
Wed Jan 27 10:50:59 GMT 2010


Paul Makepeace wrote:
> On Thu, Jan 21, 2010 at 00:34, Peter Rabbitson <rabbit+dbic at rabbit.us> wrote:
>> Paul Makepeace wrote:
>>> We've been debugging an awkward fail where the same query on the same
>>> DBIC version (8115) on the same mysql server is producing different
>>> results off two databases with very similar data (I'll expand on
>>> this).
>>>
>>> The SQL that DBIC produces has mysql returning the same dataset but in
>>> a different order; verified on the command line. Now, if in our DBIC
>>> query we add in an order_by => [..., 'me.uid'] (PK) the output from
>>> mysql is identical, as is DBIC's returned result. I dug in and saw
>>> that the SQL query plan is quite different which would explain the
>>> ordering difference. It seems like DBIC is being sensitive to order
>>> where it shouldn't be.
>>>
>>> Here's the query,
>>>      $c->model('DBIC_Readonly')->resultset('StoryCat')->search(
>>>      $search,
>>>      {
>>>        prefetch => [
>>>          'company',
>>>          'impact',
>>>          'favourability',
>>>          'topic',
>>>          {
>>>            'story' => [
>>>              { 'story_publication'  =>
>>>                [ 'region', 'publication_type' ]
>>>              },
>>>              { 'author_story' => 'author' }
>>>            ]
>>>          }
>>>        ],
>>>        order_by => 'story.date_time ASC',
>>>      });
>>>
>>> The generated query is inserting its own "order by author_story.story_uid"
>>>
>>> __PACKAGE__->table('author_story');
>>> __PACKAGE__->add_columns(qw/
>>> author_uid
>>> story_uid
>>> is_primary
>>> /);
>>> __PACKAGE__->set_primary_key(qw/author_uid story_uid/);
>>> __PACKAGE__->resultset_class('IDL::ResultSet::AuthorStory');
>>> __PACKAGE__->belongs_to(author => 'IDL::Schema::Author', 'author_uid');
>>>
>>> Is DBIC attempting to do with "order by author_story.story_uid" what
>>> we're hacking with "me.uid"?
>>>
>>> So what if anything can we do to help debug this? Is there an obvious
>>> mistake here? Do you need to see all the rels?
>> Not the rels, but the generated SQL itself. Also hilight *in the SQL*
>> what is bothering you. This is not a bug, but in order to explain to
>> you what happens and why I need the actual SQL.
> 
paste replaced with actual code)
> SELECT 
>    me.uid, me.idl_analyst_uid, me.story_uid, me.company_uid, me.industry_uid, me.subject_uid, me.subject_sentiment_uid, me.synopsis, me.weighting_uid, me.relevance, me.commentator, me.commentator_role, me.commentator_statement, me.story_commentator_uid, me.analyst_comment, me.test, 
>    company.uid, company.short_name, company.long_name, company.is_client, company.region_uid, company.industry_uid, company.fd_market_cap, company.fd_turnover_last_fy, company.login_name, company.display_name, company.password, company.search_alias, company.created, company.updated, company.parent, company.search_string, company.archived, company.added_by_user_uid, company.unchecked,
>    impact.uid, impact.parent_uid, impact.value, impact.hide, impact.alias, impact.order_priority, impact.show_in_lists, impact.data_type, impact.cascade_data_type, impact.description,
>    favourability.uid, favourability.parent_uid, favourability.value, favourability.hide, favourability.alias, favourability.order_priority, favourability.show_in_lists, favourability.data_type, favourability.cascade_data_type, favourability.description,
>    topic.uid, topic.parent_uid, topic.value, topic.hide, topic.alias, topic.order_priority, topic.show_in_lists, topic.data_type, topic.cascade_data_type, topic.description,
>    story.date_time, story.uid, story.headline, story.hyperlink, story.major_news_flag, story.publication_uid, story.body_text, story.external_uid, story.extract, story.xml_import_job_uid, story.source_uid, story.source_supplier_uid, story.doc_ref, story.test, story.status, story.added_by_user_uid, story.snippets, story.section, story.page, story.ave,
>    story_publication.uid, story_publication.name, story_publication.short_name, story_publication.url, story_publication.languages, story_publication.description, story_publication.publication_type_uid, story_publication.region_uid, story_publication.channel_type_uid, story_publication.circulation, story_publication.ave, story_publication.publisher_uid, story_publication.comments, story_publication.last_checked_by_uid, story_publication.circulation_estimate_flag, story_publication.circulation_last_updated, story_publication.circulation_update_history, story_publication.circulation_comments, story_publication.ave_estimate_flag, story_publication.ave_last_updated, story_publication.ave_update_history, story_publication.ave_comments, story_publication.last_checked_datetime,
>    region.uid, region.parent_uid, region.value, region.hide, region.alias, region.order_priority, region.show_in_lists, region.data_type, region.cascade_data_type, region.description,
>    publication_type.uid, publication_type.parent_uid, publication_type.value, publication_type.hide, publication_type.alias, publication_type.order_priority, publication_type.show_in_lists, publication_type.data_type, publication_type.cascade_data_type, publication_type.description,
>    author_story.author_uid, author_story.story_uid, author_story.is_primary, author.uid, author.forename, author.surname, author.middle_initial, author.author_type_uid, author.broker_uid
>   FROM story_cat me
>   JOIN company company ON company.uid = me.company_uid
>   JOIN code_tree impact ON impact.uid = me.weighting_uid
>   JOIN code_tree favourability ON favourability.uid = me.subject_sentiment_uid
>   JOIN code_tree topic ON topic.uid = me.subject_uid
>   JOIN story story ON story.uid = me.story_uid
>   JOIN publication story_publication ON story_publication.uid = story.publication_uid
>   LEFT JOIN code_tree region ON region.uid = story_publication.region_uid
>   LEFT JOIN code_tree publication_type ON publication_type.uid = story_publication.publication_type_uid
>   LEFT JOIN author_story author_story ON author_story.story_uid = story.uid
>   LEFT JOIN author author ON author.uid = author_story.author_uid
> WHERE ( ( me.company_uid IN ( 552, 878 ) AND ( story.date_time <= '2009-12-10' AND story.date_time >= '2009-12-10' ) AND story.status = 'published' ) )
> ORDER BY story.date_time ASC, author_story.story_uid;
> 
> There's no specific issue with the SQL per se (altho the author_story.story_uid
> isn't something we were expecting), it's DBIC's apparent failure to turn it back
> into the appropriate data structure depending on the order of the results returned
> from the DB. We can coax DBC into doing the right thing by adding a 'me.uid' to
> the 'order by'. 
> 
> Thanks,
> Paul

This is a side effect of a has_many prefetch - an order_by comprising the
right-side-table PKs is tacked to the *end* of the current order list,
in order to aid with proper collapsing. There is a branch that almost
totally reworks collapsing and *maybe* the order_by will not be necessary
any more, but it still needs some TLC, and other things pushed it into the
long queue. In any case - for the time being just add a full order-chain,
which will be prepended to the auto-generated order list.

Cheers



More information about the DBIx-Class mailing list