I am getting odd behavior w/ MySQL 5.6.13 (though I've reproduced it w/ Percona 5.5.31, as well) on tables that are partitioned by the WEEKDAY function of a datetime column.
First, bit of background for the tables. These are mobile application statistic tables that have a high volume of inserts, but we only use the last 48 hours of data. So we have a job that runs every night to truncate the partition for WEEKDAY(now()) -2 to purge the data after we no longer need it.
The behavior I'm running into is that for certain range based on our datetime column, the optimizer thinks that there is no/limited data in the partition.
Example w/ smaller test table:
Quote
root@local-3306 [debuglog_test]>alter table stats_201402 PARTITION BY LIST (WEEKDAY(created))
-> (PARTITION p0 VALUES IN (0) ENGINE = InnoDB,
-> PARTITION p1 VALUES IN (1) ENGINE = InnoDB,
-> PARTITION p2 VALUES IN (2) ENGINE = InnoDB,
-> PARTITION p3 VALUES IN (3) ENGINE = InnoDB,
-> PARTITION p4 VALUES IN (4) ENGINE = InnoDB,
-> PARTITION p5 VALUES IN (5) ENGINE = InnoDB,
-> PARTITION p6 VALUES IN (6) ENGINE = InnoDB);
Query OK, 426763 rows affected (3 min 14.18 sec)
Records: 426763 Duplicates: 0 Warnings: 0
root@local-3306 [debuglog_test]>explain SELECT count(*) as cnt FROM debuglog_test.stats_201402 _BASE WHERE _BASE.`created` BETWEEN '2014-02-11 00:00:00' AND '2014-02-11 23:59:59';
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | SIMPLE | _BASE | range | created | created | 5 | NULL | 1 | Using where; Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.08 sec)
root@local-3306 [debuglog_test]>explain SELECT count(*) as cnt FROM debuglog_test.stats_201402 _BASE WHERE _BASE.`created` BETWEEN '2014-02-12 00:00:00' AND '2014-02-12 23:59:59';
+----+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | _BASE | range | created | created | 5 | NULL | 178318 | Using where; Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
1 row in set (0.08 sec)
Both of those ranges have roughly the same amount of actual rows in them; however, the explain estimates the first as 1 and the second as 178k. When we start running more complicated queries, the optimizer starts using the wrong indexes because it uses a false estimation on the cost of using the created index. Analyzing the table or partitions does not update the cost estimation for the datetime range.
Removing the partitioning by weekday, however, does remedy this issue. It is not limited to just partitioning by list of weekdays though. I also partitioned by day of the month, via hash, list & range, and received the same results.
I've tried tweaking the various innodb_stats_X_X variables to update stats more frequently/less frequently and nothing has seemed to affect this behavior.