MySQL Forums
Forum List  »  Optimizer & Parser

Optimizer produces incorrect resultset
Posted by: J Barchan
Date: September 27, 2018 09:35AM

I have joined this forum to ask an advanced question about MySQL because I don't think I will get a good enough answer on stackoverflow.

I do not know what the form/response is like here, so I'm going to keep this very brief and see.

The full detail of the question is at https://stackoverflow.com/questions/52535846/mysql-left-join-with-where-function-call-produces-wrong-result/52537271#52537271

The brief "conceptual" summary is:

* You have a query with a LEFT JOIN and a WHERE clause which calls a user function (NOT DETERMINISTIC, just in case) which takes parameters from the (left and) right table:

SELECT * FROM left
LEFT JOIN right ON right.something = left.something
WHERE user_function(left.col1, right.col2)

* The user_function is indeed expected to be called on each row, producing varying results for different parameters. I am aware of the inefficiency, that is not the point.

* The user_function *does* produce something for the result when its 2md parameter (right.col2) is NULL, which happens as a result of the LEFT JOIN not finding a match.

* The resultset is **WRONG** on rows where there was no match for the "ON" and it generated "dummy NULLs" for the right columns.

All this works fine under, say, MS SQL Server.

However under MySQL (I use 5.7) what essentially happens is that the Optimizer decides that because right.col2 was only produced by the NULL-LEFT-JOIN mechanism it will not even call the function, it will simply decide that the function could never return anything useful and such a row simply cannot match!

This has been verified by looking at the "EXPLAIN" plan for the query.

If you test by rewriting the query to have an explicit NULL:

WHERE user_function(left.col1, NULL)

instead of:

WHERE user_function(left.col1, right.col2)

it *does* work on those rows which would have had right.col2 being NULL, even though that's the same value! So it is the fact of passing such a column from the right simply as a parameter to a user function that the Optimizer is picking up on and deciding it knows that it does not want the row in the resultset without any call of the function which is returning something quite different.

So.... I have heard of optimizers being slower or faster, but not that they can cause the wrong results to be returned! I need some "rules" about what the Optimizer is going to do if I am to understand when I can and cannot safely call such a function in this case? Else I am real trouble....

Can any of you experts shed some light on this, kindly? I'd be really grateful.

Options: ReplyQuote


Subject
Views
Written By
Posted
Optimizer produces incorrect resultset
784
September 27, 2018 09:35AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.