Performance of Dependent Subquery
I'm trying to figure out why this query is taking so long with a sub-query. Here's the version with a sub-query:
This query takes: 1.34 seconds
SELECT Week, SUM(Value1) AS Value1
FROM v3_weeklyStats
WHERE ObjectID = ANY (SELECT ArtworkID FROM art_basics WHERE AccountID=235) AND Week >= YEARWEEK(SUBDATE(CURDATE(), INTERVAL 84 DAY))
AND Class='art'
GROUP BY Week
ORDER BY Week DESC
Explain:
+--------+-------------+--------+--------+---------------+--------+---------+--------+--------+--------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+--------+-------------+--------+--------+---------------+--------+---------+--------+--------+--------+
| 1 | PRIMARY | v3_weeklyStats| range | Class,Week | Week | 3 | [NULL] | 48034 | Using where|
| 2 | DEPENDENT SUBQUERY| art_basics| unique_subquery| PRIMARY,AccountID| PRIMARY| 8 | func,const| 1 | Using index; Using where|
+--------+-------------+--------+--------+---------------+--------+---------+--------+--------+--------+
But this Query, where I have eliminated the sub-query and just ran the query with the ID's: (This Query Takes .11 seconds)
SELECT Week, SUM(Value1) AS Value1
FROM v3_weeklyStats
WHERE ObjectID IN (13971,6478,6477,6476,6475,6474,6473,6472,6471,6470,6469,6468,6466,6464,6463,6462,6461,6460,6459,6458,6457,6455,6453,6452,6450,6449,6448,6447,6446,6445,6444,6443,6442,6441,6255,6254,6253,6252,6251,6250,6249,5950,5949,5948,5947,5946,5945,5944,5391,5390,5389,5388,5387,1810,1809,1808,1807,1806,1804,1803,1802,1801,1800,1799,1798,1797,1796,1795,1794,1792,1791,1790,1789,1788,1787,1786,1785,1784,1783,1782,1780,1779,1778,1777,1776,1775,1774,1772,1771,1770,1593,1592,1589,1587,1585,1584,1479,1478,1477,1476) AND Week >= YEARWEEK(SUBDATE(CURDATE(), INTERVAL 84 DAY)) AND Class='art'
GROUP BY Week
ORDER BY Week DESC
Produces This Explain:
+--------+-------------+--------+--------+---------------+--------+---------+--------+--------+--------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+--------+-------------+--------+--------+---------------+--------+---------+--------+--------+--------+
| 1 | SIMPLE | v3_weeklyStats| range | ObjectID,Class,Week| ObjectID| 4 | [NULL] | 2009 | Using where; Using temporary; Using filesort|
+--------+-------------+--------+--------+---------------+--------+---------+--------+--------+--------+
I guess my question is why? I've read that sub-queries cache the data when possible, there isn't any reason any of the subquery data should be changing. The first sub-query is using different keys, deciding to go through all 47k records, whereas the second query uses the ObjectID field and only has to process 2009 records...
Anyone know why? I'd like to use the sub-query, I know I can do this other ways which would be faster, but sub-queries are much nicer to use...
THANKS!