[Dbix-class] Why is this so slow?

Chris Cole chris at compbio.dundee.ac.uk
Tue Jun 8 15:27:22 GMT 2010


On 08/06/10 15:05, "Alex J. G. Burzyński" wrote:
> Hi,
>
> On 08/06/10 14:09, Chris Cole wrote:
>> Given the schema provided at the end, why is this code so slow?
>>
>> sub getCounts ($) {
>>     my $self = shift;
>>     my $dataset = shift;
>>
>>     croak "ERROR - dataset name is required\n" unless ($dataset);
>>
>>     my $nHits;
>>
>>     # make SQL search both technical replicates at the same time
>>     $dataset =~ s/t\d/t%/;
>>     my $rs = $self->resultset('NgsMappings')->search(
>>        {
>>           'me_sample' =>  {'like', $dataset},
>>        },
>>        {
>>           columns =>  [qw/mp_strand mp_ref_id.rs_name mp_start mp_end
>> mp_freq/],
>>           join =>  [qw/mp_me_id mp_ref_id/],
>>        }
>>     );
>>
>>     while (my $hit = $rs->next()) {
>>        my $freq = $hit->get_column('mp_freq');
>>        $nHits += $freq;
>>     }
>>     return($nHits);
>> }
>>
>> The equivalent SQL via DBI runs in ~1.5 min, but the above I killed
>> after 30min. The DBI code is:
>> my $sth = $dbh->prepare("SELECT mp_strand,rs_name,mp_start,mp_end,mp_freq
>>                                FROM ngs_mappings
>>                                JOIN ngs_map_exps on mp_me_id = me_id
>>                                JOIN ngs_ref_seqs on mp_ref_id = rs_id
>>                                WHERE me_sample like ?
>>                             ") or die "ERROR - prepare() statement
>> failed: ", $dbh->errstr();
>> $sth->execute($sample) or die "ERROR - execute() statement failed: ",
>> $dbh->errstr;
>> while (my $row = $sth->fetchrow_arrayref()) {
>>     $nHits += $row->[4];
>> }
>> $sth->finish();
>>
>> Any pointers appreciated.
>> Cheers,
>>
>> Chris
>
>
> As you are creating an object $hit (NgsMappings with it's relationships
> - NgsMapExps&  NgsRefSeqs) for each row - and if you have a huge number
> of rows returned it increases the time.

How is this different from the DBI example which iterates over all rows 
with the $sth->fetchrow_arrayref()? I don't see how the DBIC version 
should be slower.

BTW the query should return ~4m rows and the NgsMappings table has ~111m 
rows.

> But if this is real-life code, there's no need to iterate over each row
> just to sum up mp_freq, just let your DB to do the job:

This isn't real-life code. I require all the columns I specify in the 
SELECT, I just simplified it here.
Cheers,

Chris



More information about the DBIx-Class mailing list