Many Thanks.
* How many rows are generated by the subquery? (If a lot, then need to focus on it.)
Yes, you're correct; I have redone my query based on just the subquery.
* (minor) INT -> SMALLINT UNSIGNED -- In many cases you don't need a 4-byte, +/- 2 billion range value, you could use a 2-byte, 0..64K value.
I can swap these, but I'm querying a live server so I cannot make changes to it as of yet. Hopefully this won't cause too much of an issue.
* NOT NULL -- good. (Some fields are NULLable; perhaps an oversight?)
Just in the siterates table, yes; that's what happens when you have other developers working on things for you :(
* Need to see the current version of the SELECT. (Did you switch to IN? Which LEFTs are gone?)
Yes, I switched to IN. This is the subquery only; I'm not sure if you wanted the code around this too.
[code]
EXPLAIN SELECT SQL_NO_CACHE
    SUM(d.`DataVal`)/1 AS Total,
    g.GroupID,
    g.`GroupName`,
    g.`GroupDescription`,
    d.`EntryDate` AS `Date`,
    SUM(IFNULL(sr.Rate, 0) * (d.DataVal/1)) AS `Rate`,
    SUM(IFNULL(sr.Rate, 0) * (d.DataVal/1)) AS `TotalCost`
FROM master.`channelgroup` cg
JOIN master.`groups` g ON cg.`GroupID` = g.`GroupID`
JOIN `data` d ON cg.ChannelID = d.`ChannelID`
LEFT OUTER JOIN master.`siterates` sr ON d.`SiteRateID` = sr.`SiteRateID`
WHERE g.`GroupID` IN (157,78,87,79,91,81)
  AND d.EntryDate BETWEEN '2013-01-01 00:00:00' AND '2013-02-15 23:59:59'
GROUP BY g.GroupID, LEFT(d.EntryDate, 16)
[/code]
* Need to see the new EXPLAIN. Please put [ code ] and [ /code ] (without spaces) around it.
"id" "select_type" "table" "type" "possible_keys" "key" "key_len" "ref" "rows" "Extra"
"1" "SIMPLE" "cg" "ALL" \N \N \N \N "8089" "Using where; Using temporary; Using filesort"
"1" "SIMPLE" "g" "eq_ref" "PRIMARY" "PRIMARY" "4" "eddie.cg.GroupID" "1" ""
"1" "SIMPLE" "d" "ref" "PRIMARY,FK_data_ChannelID" "FK_data_ChannelID" "4" "eddie.cg.ChannelID" "806084" "Using where"
"1" "SIMPLE" "sr" "eq_ref" "PRIMARY" "PRIMARY" "4" "slave.d.SiteRateID" "1" ""
* There are only hundreds or thousands of rows in each table, correct?
Groups - hundreds, but could be thousands.
channelgroups - 8000 so far, but that's for one customer; we could eventually have 10,000 customers (not all as big), so it could be a good few million.
data - Each customer has its own data table, which joins to the master db that all the above tables join to as well. The data table for a customer could be anything from 4 million to 15 billion rows depending on the customer's size. The current customer I'm testing with has 14 million.
I have been investigating using MERGE tables, one per year's worth of data, since I would only ever query a month of data at most; maybe that would improve performance? It would also make it easier to archive a year's worth of data, and to unarchive it if needed. Even so, I still need to optimise the query above.
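Something like this yearly MERGE setup is what I had in mind. It's only a sketch: the column list and types are assumed from the query above, and MERGE requires the underlying tables to be MyISAM:

[code]
-- One MyISAM table per year (columns/types assumed, not the real schema):
CREATE TABLE data_2013 (
  ChannelID  INT NOT NULL,
  SiteRateID INT NOT NULL,
  EntryDate  DATETIME NOT NULL,
  DataVal    DECIMAL(12,4) NOT NULL,
  KEY FK_data_ChannelID (ChannelID)
) ENGINE=MyISAM;

CREATE TABLE data_2012 LIKE data_2013;

-- MERGE table presenting all years as one logical table:
CREATE TABLE data_all (
  ChannelID  INT NOT NULL,
  SiteRateID INT NOT NULL,
  EntryDate  DATETIME NOT NULL,
  DataVal    DECIMAL(12,4) NOT NULL,
  KEY FK_data_ChannelID (ChannelID)
) ENGINE=MERGE UNION=(data_2012, data_2013) INSERT_METHOD=LAST;
[/code]

Archiving a year would then just be redefining the union, e.g. ALTER TABLE data_all UNION=(data_2013), and the old year's table can be backed up and dropped.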
Thanks again for your time.
Edited 4 time(s). Last edit at 02/23/2013 04:10PM by Paul Smith.