MySQL Forums
Forum List  »  Optimizer & Parser

Re: multicolumn index & forcing index to avoid filesort
Posted by: Rick James
Date: July 05, 2010 11:58PM

Your first cut at the INDEX was about as good as it is going to get.

Ah, the much maligned "filesort". The real problem is the complexity of the SELECT; "filesort" is just one symptom. You probably cannot eliminate 'filesort' for this query.

Suggest (won't solve the problem):
(location='Barcelona' OR location='Madrid' OR location='Malaga' OR location='Cordoba' OR location='Cadiz')
-->
location IN (location='Barcelona', 'Madrid', 'Malaga', 'Cordoba', 'Cadiz')

When you do EXPLAIN SELECT ..., note that the key_len is only long enough for the first 3 fields of
INDEX(brand,adtype,published,location,price,miles)

And your second formulation was ever worse.

This is because index usage goes something like...
0. Use only one index (but pick it well)
1. Use the first few fields of the INDEX -- any fields that are tested for '=' in the WHERE clause. (This does not include your OR, or the equivalent IN.)
2. If possible, use _one_ more field for WHERE, GROUP BY, or ORDER BY.

In your first case, it will use brand,adtype,published, then futz with `location` (maybe!). I think the IN formulation might be better than the OR forumlation. But it will stop using indexes there. (And still do a filesort.)

Your second case... It _might_ use the index for the one field `ranking`, nothing more. It might decide that "ranking>0" is not selective, so, punt, let's just do a table scan. (And a filesort)

Let's consider the selectivity of the pieces in the WHERE clause...
brand='BMW' -- pretty selective; should be early in the INDEX (as in case 1)
adtype='sell' -- not very selective?
published='yes' -- not very selective?
Still, having an index starting with all three of those is probably best. (And, actually the order of the 3 in the INDEX does not matter much.)

Moving on...
location IN
price BETWEEN
mileage BETWEEN
Each of these may (or may not) be helpful, depending on what fraction of the rows matches the clause.
Also,
ORDER BY ranking
may be used next, instead of location or price or mileage.

Suggestion: Have all these indexes, do EXPLAIN to see which one it picks:
INDEX(brand,adtype,published, location)
INDEX(brand,adtype,published, price)
INDEX(brand,adtype,published, mileage)
INDEX(brand,adtype,published, ranking)
In fact, you will find that similar looking queries (but with different ranges for location/price/mileage) will pick different indexes.

Suggestion -- Shrink the table.
INTEGER -> MEDIUMINT or SMALLINT or TINYINT
Use UNSIGNED where appropriate (on all integers?)
CHAR(25) -> VARCHAR(25) -- that way BMW will take 4 bytes, not 25.

For further discussion, I need more details, please provide
* SHOW CREATE TABLE tbl\G -- engine, indexes
* SHOW TABLE STATUS LIKE 'tbl'\G -- sizes
* EXPLAIN SELECT ...\G -- clues of inefficiencies
* SHOW VARIABLES LIKE '%buffer%'; -- cache size
and surround them with [ code ] and [ / code ]

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: multicolumn index & forcing index to avoid filesort
1719
July 05, 2010 11:58PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.