MySQL Forums
Forum List  »  Performance

Re: Planner taking a bad decission when having majority of NULLs
Posted by: Guillermo Schulman
Date: May 07, 2012 08:38AM

Rick James Wrote:
-------------------------------------------------------
> > It is quite faster, almost instantaneous.
> Do not be deceived. If you started "cold", the
> first query may have spent most of its time
> loading blocks from disk. The second query needed
> the same blocks, but they were already in ram.
> Always run timing tests twice.

Hi Rick, first of all, thanks a lot for answering.
I ran it several times and the times are always almost the same. I discarded the idea of blaming any kind of cache (at DB or OS level) not only because I ran it several times with identical results, but also because I had used SQL_NO_CACHE, and mainly because the central point of my question is that the planner is actually using a different plan for both situations: NULL values and -1 values with identical distributions. I mean, they perform differently because of their different plans.

>
> These would have helped me be more definitive:
> SHOW CREATE TABLE -- engine, datatypes, etc
> SHOW TABLE STATUS -- engine, table sizes
> On a typical box, 0.80s = 80 disk hits.

Here are the CREATE TABLE commands for the involved tables:

CREATE TABLE `rContactFormData` (
`contactId` int(10) unsigned NOT NULL,
`formDataId` int(10) unsigned NOT NULL,
PRIMARY KEY (`contactId`,`formDataId`),
KEY `formDataId` (`formDataId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;

CREATE TABLE `formData2` (
`formDataId` int(10) unsigned NOT NULL,
`f8` varchar(255) DEFAULT NULL,
`f11` int(10) DEFAULT NULL,
`f13` int(10) DEFAULT NULL,
PRIMARY KEY (`formDataId`),
KEY `f11` (`f11`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;

CREATE TABLE `notReplicableMultiSelect3` (
`formDataId` int(10) unsigned NOT NULL,
KEY `idKey` (`formDataId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;

And the TABLE STATUS for each:

mysql> show table status like 'rContactFormData'\G
*************************** 1. row ***************************
Name: rContactFormData
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 477577
Avg_row_length: 31
Data_length: 15220736
Max_data_length: 0
Index_length: 6832128
Data_free: 39845888
Auto_increment: NULL
Create_time: 2012-04-26 15:21:15
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:

mysql> show table status like 'formData2'\G
*************************** 1. row ***************************
Name: formData2
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 320791
Avg_row_length: 31
Data_length: 9977856
Max_data_length: 0
Index_length: 7372800
Data_free: 39845888
Auto_increment: NULL
Create_time: 2012-04-26 21:29:18
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:


mysql> show table status like 'notReplicableMultiSelect3'\G
*************************** 1. row ***************************
Name: notReplicableMultiSelect3
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 3425
Avg_row_length: 43
Data_length: 147456
Max_data_length: 0
Index_length: 81920
Data_free: 39845888
Auto_increment: NULL
Create_time: 2012-05-07 01:57:17
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:

>
> > And then recalculating stats
> Even after recalculating, the stats will be only
> approximate. You have hinted that you are using
> InnoDB; if so, even ANALYZE leads to some
> randomness.

I know the stats are approximations, I'm just running analyze to make sure the planner works with updated stats (regardless the accuracy), to make sure that after the UDPATE (from NULLs to -1s) the cardinality is not changed. And it is not, so that does not seem to be the problem.

>
> > This is obviously the best plan in this case
> > considering that the 10 rows estimation using
> > formData2.f11 column/index is not real for NULL
> > values and it access a huge number of rows.
> That may be an artifact of the randomness of the
> stats.

Ok, but I tested it several times going back and forward with NULLs and -1s values and it always worked in one way (the correct one) for -1s and another way (the wrong one) for NULLs. That's not randomness effect.

>
> > why the planner behaves and decides differently
> > if the column contains majority of NULLs than it
> > does when containing -1?
> While your conclusion may be correct, I'm not
> convinced that it is. Could you update back to
> NULL and test the first case again? (Even then I
> would not be convinced.)

Yes, I did it several times, it always behaves the same: slow for NULLs, quick for -1s.

>
> To force the optimizer to hit the tables in a
> particular order, use STRAIGHT_JOIN. (This is not
> recommended, because you could be forcing it to
> pick the 'wrong' query plan for a different set of
> constants.)

I know. Actually, I can make the planner to choose the right plan by simply making it to ignore the index that is confusing here: IGNORE INDEX (f11) just works perfectly. But that's not a solution for me: the queries are generated dynamically with different sets of data and I should know what's the best idea in each case.

>
> > 1 - I would like to understand how the planner
> > works for future situations.
> I've spent years trying to 'understand' it.

LOL. Let's rephrase "I would like to understand a bit better how the planner works everyday".

> Still, new situations (such as yours) provide me
> with new puzzles, and new insight into what the
> planner is thinking. I think I understand the
> planner well enough to be correct most of the time
> on this forum.
>
> > 2 - I would need, as DBA, to ask the developers
> > to change the code so we start using JOIN instead
> > of the current syntaxis. Such a change might be
> > hard to be done, so, I would need solid arguments
> > to explain why MySql does not behave the same way
> > in both situations.
> I can't give you solid arguments, because of flaws
> in your experiments (QC, time twice, flip back,
> etc); and because the next case will be
> 'different'.

Please, let me know if I can help you with further information. I could even upload a dump to reproduce the situation if that helped you.

>
> > I changed it with a COUNT(*) to make it...
> Caution: The query plan can change depending on
> what columns are used in the SELECT part -- you
> might be "Using index" erroneously. In
> particular, your first EXPLAIN may not match what
> the 'real' SELECT uses!

You are right, but I doublechcecked it: in this case the plans and the performance are similar using different columns and using COUNT.

>
> I believe I can say "don't use LEFT unless it
> applies". I say this, not from a performance
> point of view, but because it can mislead. Do
> this:
> EXPLAIN EXTENDED SELECT ... LEFT ...;
> SHOW WARNINGS;
> You will (I think) see that the first LEFT
> vanished totally. This says that the optimizer
> realized that it was irrelevant. (Ditto for INNER
> and OUTER.)


Well, here comes another really weird thing: when I run EXPLAIN EXTENDED + SHOW WARNINGS it show me the following query which is supposed to be the one being "internally" run:

select sql_no_cache count(0) AS `COUNT(*)`
from `dbtest`.`rContactFormData` `rContactFormData2`
join `dbtest`.`formData2`
join `dbtest`.`notReplicableMultiSelect3`
where
(
(`dbtest`.`notReplicableMultiSelect3`.`formDataId` = `dbtest`.`formData2`.`formDataId`) and
(`dbtest`.`rContactFormData2`.`formDataId` = `dbtest`.`formData2`.`formDataId`) and
(`dbtest`.`formData2`.`formDataId` is not null) and
(`dbtest`.`formData2`.`formDataId` is not null) and
isnull(`dbtest`.`formData2`.`f11`)
)

Just for curiosity's sake, I tested what would happen if I ran that one, I mean, the one that SHOW WARNINGS shows. Surprisingly, it behaves in the "right" way, meaning it does not use the annoying index and it runs fast. So, why doesn't it works that way in the original run? Isn't it really weird?
And still more: I iterated again, I mean, ran EXPLAIN EXTENDED + SHOW WARNINGS for the new query and I got a query which was directly not equivalent, lacking the JOINinig conditions and, thus, taking for ever to run. Would it be a bug or is it an expected behaviour?

> The second LEFT...
> LEFT JOIN notReplicableMultiSelect3 AS n ON
> (n.formDataId = f.formDataId)
> WHERE n.formDataId IS NOT NULL
> Seems useless. That "IS NOT NULL" is effectively
> negating "LEFT". If you get rid of "LEFT", that
> part of the WHERE clause can also be removed.
> Perhaps you can rewrite the whole query to show
> them how much simpler it will be; maybe they will
> like that.

I do agree, I'm advocating for this change. It's not as simply as it would look: the backend code is a query generator behind a really flexible interface to request any kind of list from our data. It works succesfully and performs ok in 99% of the cases. In any case, even if I convince them to make this change, I would like to learn about this beahviour

>
> Recap.
> * The original cardinality of 10 was blatantly
> wrong, probably because quirks in InnoDB. This is
> essentially unsolvable.
> * The timings are suspect because of two caching
> issues.
> * "LEFT" is a red herring.
> * Get more info via EXPLAIN EXTENDED + SHOW
> WARNINGS
>
> More tips for you and your devs:
> http://mysql.rjweb.org/doc.php/ricksrots
> Your case is so esoteric that only one or two tips
> apply.

Thanks again!

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Planner taking a bad decission when having majority of NULLs
1319
May 07, 2012 08:38AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.