Can this be optimized any futher?
Posted by:
Rob Strong
Date: February 16, 2010 02:44PM
Here is the query:
SELECT Video_has_Tag.Video_VideoID AS VideoID
FROM Account_has_Tag
JOIN Video_has_Tag ON Video_has_Tag.Tag_TagID = Account_has_Tag.Tag_TagID
LEFT JOIN VideoHistory ON Video_has_Tag.Video_VideoID = VideoHistory.Video_VideoID
AND VideoHistory.Account_AccountID =53
WHERE Account_has_Tag.Account_AccountID =53
AND ( VideoHistory.TS < DATE_SUB( NOW( ) , INTERVAL 3 DAY )
OR VideoHistory.TS IS NULL
)
GROUP BY Video_has_Tag.Video_VideoID
ORDER BY SUM( Video_has_Tag.Weight * Account_has_Tag.Weight ) DESC
LIMIT 50
Here is the info on each table:
CREATE TABLE `Account_has_Tag` (
`Account_AccountID` int(11) NOT NULL,
`Tag_TagID` int(11) NOT NULL,
`Weight` decimal(10,0) NOT NULL,
PRIMARY KEY (`Account_AccountID`,`Tag_TagID`),
UNIQUE KEY `tag_acct_index` (`Tag_TagID`,`Account_AccountID`),
KEY `Account_AccountID` (`Account_AccountID`),
KEY `Tag_TagID` (`Tag_TagID`)
)
CREATE TABLE `Video_has_Tag` (
`Video_VideoID` int(11) NOT NULL,
`Tag_TagID` int(11) NOT NULL,
`Weight` int(11) NOT NULL,
UNIQUE KEY `vid_tag_index` (`Video_VideoID`,`Tag_TagID`),
UNIQUE KEY `tag_vid_index` (`Tag_TagID`,`Video_VideoID`),
KEY `vidindex` (`Video_VideoID`)
)
CREATE TABLE `VideoHistory` (
`Video_VideoID` int(11) NOT NULL,
`Account_AccountID` int(11) NOT NULL,
`TS` timestamp NOT NULL default CURRENT_TIMESTAMP,
`IP` varchar(15) NOT NULL,
PRIMARY KEY (`Video_VideoID`,`Account_AccountID`),
UNIQUE KEY `acct_vid_index` (`Account_AccountID`,`Video_VideoID`),
KEY `Video_VideoID` (`Video_VideoID`),
KEY `Account_AccountID` (`Account_AccountID`)
)
I have tried adding all kinds of indexes but couldn't get it to be much faster. The best I got it to took about 3 seconds. The plan is to have this statement called pretty frequently so 3 seconds is too long. The indexes I have now, which are most definitely excessive, are a two column index on Account_has_Tag for Account_AccountID and Tag_TagID in both orders, as well as individual index on each of those. The same thing for Video_has_Tag for Video_VideoID and Tag_TagID. Finally VideoHistory the same two column indexes in both directions and individual columns for Video_VideoID and Account_AccountID.
Here is the explain for the query (hopefully you can make out what this says):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Account_has_Tag ref PRIMARY,tag_acct_index,Account_AccountID,Tag_TagID PRIMARY 4 const 93 Using temporary; Using filesort
1 SIMPLE Video_has_Tag ref tag_vid_index tag_vid_index 4 wrexer_zize.Account_has_Tag.Tag_TagID 13258
1 SIMPLE VideoHistory eq_ref PRIMARY,acct_vid_index,Video_VideoID,Account_Accou... PRIMARY 8 wrexer_zize.Video_has_Tag.Video_VideoID,const 1 Using where
Thanks for any suggestions.