[Dbix-class] MySQL "OR" Performance for Large Query Sets

demerphq demerphq at gmail.com
Sun Nov 16 12:48:39 GMT 2008


2008/11/16 Jon Schutz <jon+dbix at youramigo.com>:
> I've just done some benchmarking on MySQL 5 for queries where you need
> to select back many individual records, e.g. you could use
>
> SELECT ... WHERE id=? (and iterate for all possible id values)
>
> or
>
> SELECT ... WHERE id=1 OR id=2 OR id=3 OR ...
>
> or
>
> SELECT ... WHERE id IN (1, 2, 3, ...)
>
> and there are several other ways to do it.
>
> What I found was that using OR, at a certain point (when selecting >
> about 1000 keys) the performance went exponentially sour - like, to
> quote one benchmark, the OR query took 6400 secs vs 30s typical for
> other methods.  I've written the results up in more detail at
> http://notes.jschutz.net/19/perl/mysql-many-row-select-performance -
> have a look at the graph if nothing else.  Using 'IN' is the best
> solution for any sort of large key set.
>
> I figure this is relevant to DBIx::Class since it uses SQL::Abstract
> which will translate a search spec such as A => [ 1 .. 1000 ] into (A=?
> OR A=? ... ) 1000 times, which could be a real gotcha (indeed, some of
> my own code, which I shall now be reviewing, can expect arrays of tens
> of thousands of keys...).
>
> I was wondering - has anyone else seen similar behaviour?

Yes. In some cases or is faster (older MySQL servers) in some cases IN
is faster (newer MySQL). In most cases they ARENT equivalent, the
optimiser is just not smart enough, nor designed for this type of
equivalence conversion. Also, with either, at a certain point the
optimiser figures out it cant use the indexes on the column and turns
things into a table scan, and with the IN, it can construct an in
memory hash, and check each value in O(1) time, where as with the OR
approach it doesnt, and thus operates in O(k) time.

If you are generate IN lists of 10s of thousands of items I bet you
would be better to create an indexed temporary table containing the
list, and then use join syntax.

Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"



More information about the DBIx-Class mailing list