MySQL Forums
Forum List  »  Performance

Re: Planner taking a bad decission when having majority of NULLs
Posted by: Rick James
Date: May 03, 2012 08:27AM

> It is quite faster, almost instantaneous.
Do not be deceived. If you started "cold", the first query may have spent most of its time loading blocks from disk. The second query needed the same blocks, but they were already in ram. Always run timing tests twice.

These would have helped me be more definitive:
SHOW CREATE TABLE -- engine, datatypes, etc
SHOW TABLE STATUS -- engine, table sizes
On a typical box, 0.80s = 80 disk hits.

Running twice can lead to another deception. The Query cache could be doing the work "instantaneously" instead of the actual query. Make sure the Query cache is not involved. Recommend "SELECT SQL_NO_CACHE ..." for all timing tests.
0.01s could be "found query in Query cache, so did not have to execute it".

Since the query plan decided to start with formData2, which is after LEFT JOIN, the "LEFT" is irrelevant.
> we use "LEFT JOIN + IS NOT NULL"
You did not really do that. To have LEFT JOIN kick in, you need to be testing the value in the ON.

> And then recalculating stats
Even after recalculating, the stats will be only approximate. You have hinted that you are using InnoDB; if so, even ANALYZE leads to some randomness.

> This is obviously the best plan in this case considering that the 10 rows estimation using formData2.f11 column/index is not real for NULL values and it access a huge number of rows.
That may be an artifact of the randomness of the stats.

> why the planner behaves and decides differently if the column contains majority of NULLs than it does when containing -1?
While your conclusion may be correct, I'm not convinced that it is. Could you update back to NULL and test the first case again? (Even then I would not be convinced.)

To force the optimizer to hit the tables in a particular order, use STRAIGHT_JOIN. (This is not recommended, because you could be forcing it to pick the 'wrong' query plan for a different set of constants.)

> 1 - I would like to understand how the planner works for future situations.
I've spent years trying to 'understand' it. Still, new situations (such as yours) provide me with new puzzles, and new insight into what the planner is thinking. I think I understand the planner well enough to be correct most of the time on this forum.

> 2 - I would need, as DBA, to ask the developers to change the code so we start using JOIN instead of the current syntaxis. Such a change might be hard to be done, so, I would need solid arguments to explain why MySql does not behave the same way in both situations.
I can't give you solid arguments, because of flaws in your experiments (QC, time twice, flip back, etc); and because the next case will be 'different'.

> I changed it with a COUNT(*) to make it...
Caution: The query plan can change depending on what columns are used in the SELECT part -- you might be "Using index" erroneously. In particular, your first EXPLAIN may not match what the 'real' SELECT uses!

I believe I can say "don't use LEFT unless it applies". I say this, not from a performance point of view, but because it can mislead. Do this:
EXPLAIN EXTENDED SELECT ... LEFT ...;
SHOW WARNINGS;
You will (I think) see that the first LEFT vanished totally. This says that the optimizer realized that it was irrelevant. (Ditto for INNER and OUTER.)

The second LEFT...
LEFT JOIN notReplicableMultiSelect3 AS n ON (n.formDataId = f.formDataId)
WHERE n.formDataId IS NOT NULL
Seems useless. That "IS NOT NULL" is effectively negating "LEFT". If you get rid of "LEFT", that part of the WHERE clause can also be removed. Perhaps you can rewrite the whole query to show them how much simpler it will be; maybe they will like that.

Recap.
* The original cardinality of 10 was blatantly wrong, probably because quirks in InnoDB. This is essentially unsolvable.
* The timings are suspect because of two caching issues.
* "LEFT" is a red herring.
* Get more info via EXPLAIN EXTENDED + SHOW WARNINGS

More tips for you and your devs:
http://mysql.rjweb.org/doc.php/ricksrots
Your case is so esoteric that only one or two tips apply.

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Planner taking a bad decission when having majority of NULLs
1213
May 03, 2012 08:27AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.