MySQL Forums
Forum List  »  Performance

Re: Using index makes query slower than table scan
Posted by: Per Engwall
Date: October 31, 2005 07:07AM

Thanks Daniel and Hubblespace for taking the time, highly appreciated.

I know the query spans over a large number of records. Since dimensions type, year and period only contain few members one way to work around this is to create additional fact tables with fewer dimensions thus reducing the number of records substantially. However my hopes were to be able to avoid this solution.

The database layout is a simple star scheme with a couple of fact tables and a total of 11 dimensions where one of the fact tables contain all dimensions. Regarding index design I rely on the recommendations proposed by Bert Scalzo for use when query types are not known in advance.

Quote from presentation:
"Fact tables should have primary keys – for data load integrity
Fact table dimension reference (i.e. foreign key) columns should each be individually indexed – for variable fact/dimension joins
Dimension tables should have primary keys
Dimension tables should be fully indexed"

I'm new to using MySQL so I did some reading and found that MyISAM engine does not use clustered primary keys which explains why using the index in this case is slower than a table scan. What I don't understand is why the query optimizer doesn't take this fact into consideration, especially since table and index info is up to date, and performs a table scan instead. I can build a solution which keeps index distribution for some columns and when necessary include IGNORE INDEX (PRIMARY) in the query which seems like stupid way to force a certain behaviour.

My previous experiment with InnoDB did not include defining a primary key so after reading that primary keys for InnoDB are in fact clustered I tried this and got acceptable performance for the example query. This is expected since now only a portion of the table has to be scanned due to least changing columns comes first in key. Overall performance using InnoDB engine is still lower than using MyISAM, all query types considered.

So the current solution is to use MyISAM with no primary key, but with single column idexes for all dimensions. This way the optimizer seems to know when to use an index and when to perform a table scan. As soon as a primary key is defined the optimizer chooses to process large spanning queries using this index instead of a using table scan thus degrading performance about 10 times. I don't know if this is a bug or not or whether some setting exist to change the behaviour, from my point of view it's a bug :-)

Thanks again!
/Per

Options: ReplyQuote


Subject
Views
Written By
Posted
2116
October 25, 2005 07:36AM
Re: Using index makes query slower than table scan
1409
October 31, 2005 07:07AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.