[Dbix-class] MySQL "OR" Performance for Large Query Sets
Jon Schutz
jon+dbix at youramigo.com
Sun Nov 16 12:41:02 GMT 2008
I've just done some benchmarking on MySQL 5 for queries where you need
to select back many individual records, e.g. you could use
SELECT ... WHERE id=? (and iterate for all possible id values)
or
SELECT ... WHERE id=1 OR id=2 OR id=3 OR ...
or
SELECT ... WHERE id IN (1, 2, 3, ...)
and there are several other ways to do it.
What I found was that using OR, at a certain point (when selecting >
about 1000 keys) the performance went exponentially sour - like, to
quote one benchmark, the OR query took 6400 secs vs 30s typical for
other methods. I've written the results up in more detail at
http://notes.jschutz.net/19/perl/mysql-many-row-select-performance -
have a look at the graph if nothing else. Using 'IN' is the best
solution for any sort of large key set.
I figure this is relevant to DBIx::Class since it uses SQL::Abstract
which will translate a search spec such as A => [ 1 .. 1000 ] into (A=?
OR A=? ... ) 1000 times, which could be a real gotcha (indeed, some of
my own code, which I shall now be reviewing, can expect arrays of tens
of thousands of keys...).
I was wondering - has anyone else seen similar behaviour?
--
Jon Schutz My tech notes http://notes.jschutz.net
Chief Technology Officer http://www.youramigo.com
YourAmigo
More information about the DBIx-Class
mailing list