MySQL Forums
Forum List  »  Optimizer & Parser

Explain estimate completely wrong for range when partitioned by day
Posted by: Blake Harps
Date: February 13, 2014 01:23PM

I am getting odd behavior w/ MySQL 5.6.13 (though I've reproduced it w/ Percona 5.5.31, as well) on tables that are partitioned by the WEEKDAY function of a datetime column.

First, bit of background for the tables. These are mobile application statistic tables that have a high volume of inserts, but we only use the last 48 hours of data. So we have a job that runs every night to truncate the partition for WEEKDAY(now()) -2 to purge the data after we no longer need it.

The behavior I'm running into is that for certain range based on our datetime column, the optimizer thinks that there is no/limited data in the partition.

Example w/ smaller test table:

Quote

root@local-3306 [debuglog_test]>alter table stats_201402 PARTITION BY LIST (WEEKDAY(created))
-> (PARTITION p0 VALUES IN (0) ENGINE = InnoDB,
-> PARTITION p1 VALUES IN (1) ENGINE = InnoDB,
-> PARTITION p2 VALUES IN (2) ENGINE = InnoDB,
-> PARTITION p3 VALUES IN (3) ENGINE = InnoDB,
-> PARTITION p4 VALUES IN (4) ENGINE = InnoDB,
-> PARTITION p5 VALUES IN (5) ENGINE = InnoDB,
-> PARTITION p6 VALUES IN (6) ENGINE = InnoDB);
Query OK, 426763 rows affected (3 min 14.18 sec)
Records: 426763 Duplicates: 0 Warnings: 0

root@local-3306 [debuglog_test]>explain SELECT count(*) as cnt FROM debuglog_test.stats_201402 _BASE WHERE _BASE.`created` BETWEEN '2014-02-11 00:00:00' AND '2014-02-11 23:59:59';
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | SIMPLE | _BASE | range | created | created | 5 | NULL | 1 | Using where; Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.08 sec)

root@local-3306 [debuglog_test]>explain SELECT count(*) as cnt FROM debuglog_test.stats_201402 _BASE WHERE _BASE.`created` BETWEEN '2014-02-12 00:00:00' AND '2014-02-12 23:59:59';
+----+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | _BASE | range | created | created | 5 | NULL | 178318 | Using where; Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
1 row in set (0.08 sec)

Both of those ranges have roughly the same amount of actual rows in them; however, the explain estimates the first as 1 and the second as 178k. When we start running more complicated queries, the optimizer starts using the wrong indexes because it uses a false estimation on the cost of using the created index. Analyzing the table or partitions does not update the cost estimation for the datetime range.

Removing the partitioning by weekday, however, does remedy this issue. It is not limited to just partitioning by list of weekdays though. I also partitioned by day of the month, via hash, list & range, and received the same results.

I've tried tweaking the various innodb_stats_X_X variables to update stats more frequently/less frequently and nothing has seemed to affect this behavior.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.