Re: Can we expect an enhancement for derived table merge optimization?
Posted by:
Roy Lyseng
Date: January 14, 2013 08:43AM
Rick James Wrote:
-------------------------------------------------------
> Roy, thanks for the insights. I am an outsider
> looking in. I have been using MySQL since 3.53 in
> 1999. I try to keep up with all the subtle
> issues, but we are getting into some details I am
> weak on. What I know comes from conferences; I
> have avoided VIEWs because of their limitations.
> It has always felt like VIEWs were implemented
> hastily and that the Optimizer was not involved
> enough in the implementation. (As you say: "that
> derived tables and view management were simply
> developed in two separate stages".)
I think that views are pretty well integrated into the optimizer code by now. But there will be more consolidation with the derived tables handing going forward.
If you see deviations between handling of views and base tables, please file bug reports for them. It could be very useful for us going forward, in order to get more robust handling of views in the optimizer code.
>
> > We are working on a solution were we merge
> simple derived tables into the outer query block
> Good to hear.
>
> > Case 2 ... are able to create an index on the
> derived table
>
> Roy, Is that 5.6.7? Or earlier??
I think it was earlier, in 5.6.3.
>
> > Case 5 ... a) If the subquery is non-aggregated,
> it is converted to a semi-join. See below.
>
> If the subquery is complex (eg, including JOINs,
> etc); "a)" could be significantly slower than
> "b)". OTOH, if the _outer_ query has some form of
> LIMIT, then evaluating "a)" a few times could be
> better than a big materialized subquery, plus
> indexing. (Perhaps you cover this debate in "Case
> 6")
>
In principle, solution a) should evaluate all possible strategies, including regular nested-loops join, join buffering and materialization, and choose the least costly plan.
The LIMIT clause may cause problems for the optimizer, however. For a nested-loops execution, it should not mean very much, but for a materialization or Block Nested Loop join buffering, an operation that may be relatively cheap compared to the nested-loops part without LIMIT, may become overwhelmingly expensive with a small LIMIT. In such cases, you may have to "help" the optimizer by disabling join buffering or materialization.
> Roy, these seem to be contradictory:
> > MySQL 5.6 does not include view merging.
I am sorry. I meant to write "derived table merging".
> > I am not aware of any principal differences
> between views and regular tables
> (Granted, it is a stretch to say those are even
> related.)
> * If a VIEW has a WHERE clause and the SELECT
> using the VIEW also has a WHERE clause, the WHEREs
> are not combined before optimizing, correct?
> (That's what you mean by "merging"?)
> * A "regular" table (or more precisely a regular
> SELECT) would look at the entire WHERE, thereby
> having more optimization opportunities.
>
> I see 'merging' as the main step toward making
> VIEWs first-class, competitive, constructs.
The WHERE conditions are merged in an early stage of the optimizer, and the combined condition is then used as the WHERE clause throughout the rest of the optimization process. I think that JOIN conditions are also merged correctly, both in the case where a view itself is a JOIN, and when a JOIN combines base tables and views.
There are of course problems, such as the inability to create a view that includes a derived table in the FROM clause, and some wrong results with outer joins combined with views in a few circumstances.