Re: Optimize subquery, replace IN() with JOIN
Hi,
EXPLAIN shows that the optimizer has chosen the so-called MatLookup strategy for the semi-join execution. That is, it will store result of joining the tables of the IN-subqueries in a temporary table. That is, join category_content_se with itself. During this process duplicates will be removed. It then goes through the content table using the published index, and for each row that match the where predicate, it will do a look-up into the temporary table to see if there is a match on ID.
It might be that there are more optimal semijoin strategies. You could try to see what happens if you do "SET optimizer_switch='materialization=off'".
However, I am not convinced that it is the subqueries that give you problems. I rather suspect that the reason is that MySQL will generate all potential rows, then sort them, and pick the 10 newest. If you were able to process the rows in descending date order, you will be able to stop when 10 rows have been found. To do that you will need an index on date_add column. Then best would be to create a compound index on (published, typ, date_add). Then only the interesting rows of content table need to be access.
(I assume that the EXPLAIN output is from an example database. I have a hard time understanding how this query could take 5 minutes with the row numbers presented.)
Øystein Grøvlen,
Senior Principal Software Engineer,
MySQL Group, Oracle,
Trondheim, Norway
Subject
Views
Written By
Posted
2544
November 15, 2016 02:37PM
1029
November 15, 2016 03:34PM
1188
November 15, 2016 04:25PM
1028
November 17, 2016 08:11PM
983
November 22, 2016 02:44PM
1021
November 16, 2016 04:03AM
1029
November 16, 2016 12:51PM
1168
November 16, 2016 03:22PM
973
November 16, 2016 04:44PM
785
December 27, 2016 05:32AM
895
November 16, 2016 01:50AM
941
November 16, 2016 04:01AM
Re: Optimize subquery, replace IN() with JOIN
1095
November 16, 2016 06:23AM