Query optimizer fails terribly.
Hello,
I have developed a forum where the messages are stored in a table named frmsg (schema at the bottom). One of the queries I run very often selects the first 40 top-level messages of a specific forum:
SELECT
f.frmsg_id, f.frmsg_frm_id, f.frmsg_parent, f.frmsg_from_usr,
DATE_FORMAT(f.frmsg_date, '%Y%m%d %H:%i') as date,
f.frmsg_approved, f.frmsg_visible, f.frmsg_sticky,
f.frmsg_views, f.frmsg_subject, f.frmsg_body,
f.frmsg_replies, f.frmsg_lastreply, f.frmsg_stkey, f.frmsg_lrkey,
r.frmsg_from_usr as lastreply_from,
DATE_FORMAT(r.frmsg_date, '%Y%m%d %H:%i') as lastreply_date,
usr_username, usr_sex, usr_flags, usr_mainphoto
FROM frmsg f
JOIN usr ON f.frmsg_from_usr = usr_username
LEFT JOIN frmsg r ON f.frmsg_lastreply = r.frmsg_id
WHERE f.frmsg_parent IS NULL AND f.frmsg_visible = 1 AND f.frmsg_frm_id = '1'
ORDER BY f.frmsg_id DESC
LIMIT 0, 39;
The idx_frmsg_topics_date index is created specifically for this query, it contains all 3 fiels in the where clause and the order by field in the end. Obviously it is perfect for the query. Yet, the query optimizer selects the idx_frmsg_parent index leading to extra seeks for the where fields and a filesort!
EXPLAIN:
1 SIMPLE f ref idx_frmsg_topics_date,idx_frmsg_topics_st,idx_frms... idx_frmsg_parent 5 const 8194 Using where; Using filesort
1 SIMPLE usr eq_ref PRIMARY PRIMARY 20 zoo.f.frmsg_from_usr 1
1 SIMPLE r eq_ref PRIMARY PRIMARY 4 zoo.f.frmsg_lastreply 1
With this strategy the query takes up to 4 seconds depending on load. If I use FORCE INDEX(idx_frmsg_topics_date) then it executes in 0.0086 secs! Correct me if I'm wrong but this seems a very easy case for mysql's query optimizer and I don't understand why it fails so miserably. My only explanation is that the expected rows with both indexes are more or else the same so it prefers the much smaller idx_parent index. But then it completely ignores that idx_frmsg_topics_date provides the records already sorted, and we just need the first 40 of them.
Lately I keep finding cases where I have to use FORCE INDEX to select the right index. But I don't thing it's a good practice, you have to be very careful to force the index only in the right case. Do you have any idea why the optimizer performs so poorly? In you experience, do you often need to use FORCE INDEX?
Kostas
PS. ANALYZE TABLE didn't help at all.
CREATE TABLE frmsg
(
frmsg_id INT NOT NULL
AUTO_INCREMENT,
frmsg_frm_id INT NOT NULL,
frmsg_parent INT NULL,
frmsg_from_usr VARCHAR(20) NOT NULL,
frmsg_replies SMALLINT NOT NULL,
frmsg_lastreply INT NULL,
frmsg_date DATETIME NOT NULL,
frmsg_approved BIT NOT NULL,
frmsg_visible BIT NOT NULL,
frmsg_sticky BIT NOT NULL,
frmsg_views SMALLINT NOT NULL,
frmsg_stkey INT NULL, -- stickies key
frmsg_lrkey INT NULL, -- last reply key
frmsg_subject TEXT NOT NULL,
frmsg_body TEXT NOT NULL,
FULLTEXT (frmsg_subject, frmsg_body),
CONSTRAINT pk_frmsg
PRIMARY KEY (frmsg_id)
);
ALTER TABLE frmsg ADD INDEX idx_frmsg_topics_st(frmsg_frm_id, frmsg_stkey);
ALTER TABLE frmsg ADD INDEX idx_frmsg_topics_lr(frmsg_frm_id, frmsg_lrkey);
ALTER TABLE frmsg ADD INDEX idx_frmsg_topics_date(frmsg_frm_id, frmsg_parent, frmsg_visible, frmsg_id);
ALTER TABLE frmsg ADD INDEX idx_frmsg_parent(frmsg_parent);
ALTER TABLE frmsg ADD INDEX idx_frmsg_from(frmsg_from_usr, frmsg_frm_id, frmsg_sticky);
ALTER TABLE frmsg ADD INDEX idx_frmsg_flags(frmsg_approved, frmsg_visible, frmsg_sticky);