MySQL Forums
Forum List  »  Performance

Planner taking a bad decission when having majority of NULLs
Posted by: Guillermo Schulman
Date: May 02, 2012 10:16AM

Hi you all,

I have the following query:

SELECT COUNT(*)
FROM
rContactFormData AS rContactFormData2
LEFT JOIN formData2 ON (rContactFormData2.formDataId = formData2.formDataId)
LEFT JOIN notReplicableMultiSelect3 ON (notReplicableMultiSelect3.formDataId = formData2.formDataId)
WHERE
notReplicableMultiSelect3.formDataId IS NOT NULL AND
formData2.formDataId IS NOT NULL AND
formData2.f11 IS NULL
;

+----------+
| COUNT(*) |
+----------+
| 3540 |
+----------+
1 row in set (0.80 sec)


(the original query retrieves values from the 3 involved tables, I changed it with a COUNT(*) to make it easier and cleaner to show the results, but all the JOINS are necessary... anyway I checked that the behaviour I want to show here remained unchanged)

So, the query takes, every time I run it, 0.80 sec which seems to be too much.

This is the plan I get using EXPLAIN:
+----+-------------+---------------------------+------+---------------+------------+---------+----------------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------------+------+---------------+------------+---------+----------------------------+------+--------------------------+
| 1 | SIMPLE | formData2 | ref | PRIMARY,f11 | f11 | 5 | const | 10 | Using where; Using index |
| 1 | SIMPLE | rContactFormData2 | ref | formDataId | formDataId | 4 | apria.formData2.formDataId | 1 | Using index |
| 1 | SIMPLE | notReplicableMultiSelect3 | ref | idKey | idKey | 4 | apria.formData2.formDataId | 2 | Using where |
+----+-------------+---------------------------+------+---------------+------------+---------+----------------------------+------+--------------------------+

Apparently It is considering f11 a good key. But taking a look at the values distribution for that table, this is what I get:

SELECT f11, count(*) FROM formData2 GROUP BY f11;
+------+----------+
| f11 | count(*) |
+------+----------+
| NULL | 318594 |
| 461 | 822 |
| 462 | 539 |
| 463 | 223 |
| 469 | 148 |
+------+----------+
5 rows in set (0.11 sec)

This means that f11 column contains 99% NULLs values.
This is the cardinality:
SHOW INDEX FROM formData2;
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| formData2 | 0 | PRIMARY | 1 | formDataId | A | 320942 | NULL | NULL | | BTREE | | |
| formData2 | 1 | f11 | 1 | f11 | A | 6 | NULL | NULL | YES | BTREE | | |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)


So, I decided to check what would happen if I changed those NULLs by a particular value, let's say -1:

UPDATE formData2 SET f11 = -1 WHERE f11 IS NULL;

And then recalculating stats:

ANALYZE TABLE formData2;

This is the new cardinality situation:
SHOW INDEX FROM formData2;
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| formData2 | 0 | PRIMARY | 1 | formDataId | A | 320641 | NULL | NULL | | BTREE | | |
| formData2 | 1 | f11 | 1 | f11 | A | 7 | NULL | NULL | YES | BTREE | | |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.01 sec)

It didn't change that much.

But, when I re-ran the query changing the condition (f11 IS NULL) accordingly (f11 = -1):

SELECT COUNT(*)
FROM
rContactFormData AS rContactFormData2
LEFT JOIN formData2 ON (rContactFormData2.formDataId = formData2.formDataId)
LEFT JOIN notReplicableMultiSelect3 ON (notReplicableMultiSelect3.formDataId = formData2.formDataId)
WHERE
notReplicableMultiSelect3.formDataId IS NOT NULL AND
formData2.formDataId IS NOT NULL AND
formData2.f11 = -1
;

+----------+
| COUNT(*) |
+----------+
| 3540 |
+----------+
1 row in set (0.01 sec)

It is quite faster, almost instantaneous.
This is the new plan I get when using EXPLAIN:
+----+-------------+---------------------------+--------+---------------+------------+---------+--------------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------------+--------+---------------+------------+---------+--------------------------------------------+------+-------------+
| 1 | SIMPLE | notReplicableMultiSelect3 | ALL | idKey | NULL | NULL | NULL | 3581 | Using where |
| 1 | SIMPLE | formData2 | eq_ref | PRIMARY,f11 | PRIMARY | 4 | apria.notReplicableMultiSelect3.formDataId | 1 | Using where |
| 1 | SIMPLE | rContactFormData2 | ref | formDataId | formDataId | 4 | apria.notReplicableMultiSelect3.formDataId | 1 | Using index |
+----+-------------+---------------------------+--------+---------------+------------+---------+--------------------------------------------+------+-------------+

Which shows that now the planner is not using the index f11 but a full scan on notReplicableMultiSelect3 table with 3581 rows. This is obviously the best plan in this case considering that the 10 rows estimation using formData2.f11 column/index is not real for NULL values and it access a huge number of rows.
The question is: why the planner behaves and decides differently if the column contains majority of NULLs than it does when containing -1? I checked the variable innodb_stats_method which I supposed could have been the answer:
mysql> show variables like 'innodb_stats_method';
+---------------------+-------------+
| Variable_name | Value |
+---------------------+-------------+
| innodb_stats_method | nulls_equal |
+---------------------+-------------+
According to mysql documentation:
"For nulls_equal, all NULL index values are considered equal and form a single value group that has a size equal to the number of NULL values."
Such behavior should make the server treat NULLs values as any other value when collecting statistics. Since the cardinality remained almost the same I guess this is really happening but apparently it does not detect the bad distribution, I mean, the majority of NULLs, when deciding the plan.

Finally, let me share an additional observation: as you might have already noticed, due to the dynamic way we generate the query string, we use "LEFT JOIN + IS NOT NULL" logic instead of using directly INNER JOIN which would be semantically the same thing. But It seems not to be the same for the planner which in this case solves the query with a different (and convenient) plan:

EXPLAIN
SELECT COUNT(*)
FROM
rContactFormData AS rContactFormData2
JOIN formData2 ON (rContactFormData2.formDataId = formData2.formDataId)
JOIN notReplicableMultiSelect3 ON (notReplicableMultiSelect3.formDataId = formData2.formDataId)
WHERE
formData2.f11 IS NULL;

+----+-------------+---------------------------+--------+---------------+------------+---------+--------------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------------+--------+---------------+------------+---------+--------------------------------------------+------+-------------+
| 1 | SIMPLE | notReplicableMultiSelect3 | ALL | idKey | NULL | NULL | NULL | 3581 | |
| 1 | SIMPLE | formData2 | eq_ref | PRIMARY,f11 | PRIMARY | 4 | apria.notReplicableMultiSelect3.formDataId | 1 | Using where |
| 1 | SIMPLE | rContactFormData2 | ref | formDataId | formDataId | 4 | apria.notReplicableMultiSelect3.formDataId | 1 | Using index |
+----+-------------+---------------------------+--------+---------------+------------+---------+--------------------------------------------+------+-------------+
3 rows in set (0.00 sec)

And it actually runs quickly:
+----------+
| COUNT(*) |
+----------+
| 3540 |
+----------+
1 row in set (0.01 sec)

But that's not quite enough for me as a solution for 2 reasons:
1 - I would like to understand how the planner works for future situations.
2 - I would need, as DBA, to ask the developers to change the code so we start using JOIN instead of the current syntaxis. Such a change might be hard to be done, so, I would need solid arguments to explain why MySql does not behave the same way in both situations.

What do you think?
Thanks in advance.



Edited 1 time(s). Last edit at 05/02/2012 04:04PM by Guillermo Schulman.

Options: ReplyQuote


Subject
Views
Written By
Posted
Planner taking a bad decission when having majority of NULLs
1986
May 02, 2012 10:16AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.