Re: Slow search on one table, many where, order by and limit
Posted by:
Rick James
Date: March 01, 2010 01:50PM
59 sec -> 4 sec. This is pretty typical improvement between uncached and cached versions of the same query. 4 seconds is probably not good enough for your use case.
Note: "Rows_examined: 403385". Compare: "Rows: 328042". This implies a "table scan", plus more. That is, every row has to be looked at to satisfy the query. To speed up the query we need to find a way to prune the effort to something less.
The extra 80K in Rows_examined probably comes from the temp table needed for the ORDER BY RAND(). Could it be that 80K rows match this?
(LOOKING_FOR_GENDERS +0 &2) >0 AND
GENDER IN ('MAN') AND
COUNTRY_ID =254 AND
(CARD_FLAGS +0 &18) >0
The 69M is all the indexes for the one table. At most one index would be used. Hence, 25M sufficed for your one experiment. In the long run, you will probably need the 100M.
In the bigger SELECT, these:
AND BIRTHDATE_YEAR <=1990 AND BIRTHDATE_YEAR >=1980
AND COUNTRY_ID =254
AND STATE_ID =126
AND CITY_ID =7275
Beg for an index beginning with (CITY_ID, BIRTHDATE_YEAR).
(I hope that CITY_ID is normalized such that Portland, Maine has a different id than Portland, Oregon.)
If, when given a city, you automatically add STATE_ID and COUNTRY_ID to the WHERE clause, then this index would be a good catchall for queries with either CITY or STATE specified:
INDEX(STATE_ID, CITY_ID)
At which point, you may as well combine my two recommendations here into:
INDEX(STATE_ID, CITY_ID, BIRTHDATE_YEAR)
That way, all of these can use this index, with probably a 10x speedup:
* state (without city)
* city (and implied state)
* city and birth range
* state and birth range (but it won't be any faster than state, alone)
I'm surprised that
KEY `COUNTRY_ID` (`COUNTRY_ID`,`STATE_ID`,`CITY_ID`),
did not kick in. (Here I need to see the EXPLAIN on that query.)
Contrary to what I suggested in earlier remarks, your 5 indexes with
INDEX(x, `LAST_VISITED_DATE`)
are probably rarely used. Why? Because x has very low "cardinality".
Possibly
INDEX(CITY_ID, `LAST_VISITED_DATE`)
would work well.