MySQL Forums
Forum List  »  Performance

WHERE IN() vs. BETWEEN
Posted by: Jason Scharlach
Date: October 09, 2008 10:09AM

I've been fighting with queries on a relatively large table for the past few days and have come across behavior that I'm hoping somebody can explain to me.

The table in question is:
CREATE TABLE `t` (
`d` date NOT NULL default '0000-00-00',
`src` char(3) NOT NULL default '',
`dst` char(3) NOT NULL default '',
`tot` decimal(10,1) default NULL,
PRIMARY KEY (`d`,`src`,`dst`,`tot`),
) ENGINE=MyISAM DEFAULT CHARSET=latin1

The table has approx. 21 million rows. Dates are always stored in YYYY-MM-01 format (dates are only applicable to months).

Here is the query:
SELECT * FROM t
WHERE
d BETWEEN "2006-01-01" AND "2006-12-01"
# d IN (
# "2006-01-01","2006-02-01","2006-03-01","2006-04-01","2006-05-01","2006-06-01",
# "2006-07-01","2006-08-01","2006-09-01","2006-10-01","2006-11-01","2006-12-01" )
AND src IN ( "aaa","bbb", ... ,"zzz" )
AND dst IN ( "aaa","bbb", ... ,"zzz" )
ORDER BY tot
LIMIT 10

A pretty simple query all in all - just returning the top 10 'tot' rows for a given date range, 'src' array and 'dst' array.

Now, for the confusion. There are large differences whether you are using BETWEEN or IN() for the date range (see the two methods in the query above - one is commented out).

When using BETWEEN the profiler says the majority of time is spent in "Sorting Result"

When using WHERE IN() the profiler says the majority of time is spent in "statistics" and "preparing"

So, if you have small arrays for 'src' and 'dst' the WHERE IN() method is much faster then using BETWEEN(). In fact BETWEEN() seems to have a constant time associated with it about about 14 seconds on my system all spent in "Sorting Data" no matter the size of the arrays.

However, once your arrays start getting large WHERE IN() becomes extremely slow and BETWEEN() becomes the much faster option.

My question is why? How is the optimizer treating these two queries so differently when the actually equate to the exact same thing?

Options: ReplyQuote


Subject
Views
Written By
Posted
WHERE IN() vs. BETWEEN
31452
October 09, 2008 10:09AM
5103
October 12, 2008 01:22PM
3971
October 14, 2008 07:14AM
3302
October 14, 2008 10:27PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.