Distinct/group by, order by and filesort
Posted by: Tolan Blundell
Date: February 23, 2010 01:20PM
Date: February 23, 2010 01:20PM
I've Googled and tested this one until my eyes bleed but I really can't find a solution.
A quick overview. I have an O/RM that uses left outer joins for mapping object inheritance, and can use either joins or subselects for looking up object relationships.
When using joins for object relationships I need to use distinct or order by to limit the results to one row per selected object.
So here's a sample (fetch all articles and subclasses which are published and latest version and attached to 'section' 2):
------------------------------------------------------------
------------------------------------------------------------
Here's the output from mk-visual-explain:
------------------------------------------------------------
------------------------------------------------------------
So... I simply can't avoid a filesort with the DISTINCT() present. I've also tried replacing the DISTINCT() with "GROUP BY article.article_date DESC,article.id DESC" (so it matches the order by) but again no dice. I have a huge number of test indexes (obviously this isn't a production box ;):
------------------------------------------------------------
Any thoughts? I'm about at my wits end and ready to go back to subselects. It's a shame because generally the system performs better with the joins but there are a lot of queries of this class doing big filesorts and wiping out any performance gains elsewhere...
Thanks!
Edited 2 time(s). Last edit at 02/23/2010 02:58PM by Tolan Blundell.
A quick overview. I have an O/RM that uses left outer joins for mapping object inheritance, and can use either joins or subselects for looking up object relationships.
When using joins for object relationships I need to use distinct or order by to limit the results to one row per selected object.
So here's a sample (fetch all articles and subclasses which are published and latest version and attached to 'section' 2):
------------------------------------------------------------
SELECT DISTINCT( article.id ) as TD, article.*, job.*, press_release.*, job.* FROM article LEFT OUTER JOIN press_release ON ( article.id=press_release.id AND article.TORM_REVISION_ID=press_release.TORM_REVISION_ID ) LEFT OUTER JOIN job ON ( article.id=job.id AND article.TORM_REVISION_ID=job.TORM_REVISION_ID ) LEFT OUTER JOIN section_to_article ON ( article.id=section_to_article.article_id ) AND (section_to_article.section_id = 2) WHERE (article.published = 1) AND ( article.TORM_IS_CURRENT_REVISION = 1 ) ORDER BY article.article_date DESC,article.id DESC LIMIT 50;
------------------------------------------------------------
Here's the output from mk-visual-explain:
------------------------------------------------------------
Filesort +- TEMPORARY table temporary(article,press_release,job,section_to_article) +- JOIN +- Distinct/Not-Exists | +- Bookmark lookup | +- Table | | table section_to_article | | possible_keys idx_section_to_art_art_id,idx_section_to_art_sec_id | +- Index lookup | key section_to_article->idx_section_to_art_art_id | possible_keys idx_section_to_art_art_id,idx_section_to_art_sec_id | key_len 5 | ref my_db.article.id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table job | | possible_keys job_composite | +- Index lookup | key job->job_composite | possible_keys job_composite | key_len 9 | ref my_db.article.id,my_db.article.TORM_REVISION_ID | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table press_release | | possible_keys pr_revid,pr_composite | +- Index lookup | key press_release->pr_composite | possible_keys pr_revid,pr_composite | key_len 9 | ref my_db.article.id,my_db.article.TORM_REVISION_ID | rows 1 +- Filter with WHERE +- Bookmark lookup +- Table | table article | possible_keys article_published,art_is_current_rev,art_comp_x3,art_comp_x4 +- Index lookup key article->art_is_current_rev possible_keys article_published,art_is_current_rev,art_comp_x3,art_comp_x4 key_len 2 ref const rows 10257
------------------------------------------------------------
So... I simply can't avoid a filesort with the DISTINCT() present. I've also tried replacing the DISTINCT() with "GROUP BY article.article_date DESC,article.id DESC" (so it matches the order by) but again no dice. I have a huge number of test indexes (obviously this isn't a production box ;):
------------------------------------------------------------
+---------+------------+--------------------+--------------+--------------------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +---------+------------+--------------------+--------------+--------------------------+-----------+-------------+----------+--------+------+------------+---------+ | article | 1 | torm_class_article | 1 | TORM_CLASS | A | 4 | NULL | NULL | YES | BTREE | | | article | 1 | article_published | 1 | published | A | 2 | NULL | NULL | YES | BTREE | | | article | 1 | article_date | 1 | article_date | A | 3726 | NULL | NULL | YES | BTREE | | | article | 1 | idx_article_dirty | 1 | dirty | A | 3 | NULL | NULL | YES | BTREE | | | article | 1 | article_composite | 1 | id | A | NULL | NULL | NULL | | BTREE | | | article | 1 | article_composite | 2 | TORM_REVISION_ID | A | NULL | NULL | NULL | YES | BTREE | | | article | 1 | article_composite2 | 1 | id | A | 7452 | NULL | NULL | | BTREE | | | article | 1 | article_composite2 | 2 | TORM_REVISION_ID | A | 22356 | NULL | NULL | YES | BTREE | | | article | 1 | article_composite2 | 3 | TORM_CLASS | A | 22356 | NULL | NULL | YES | BTREE | | | article | 1 | article_composite2 | 4 | TORM_IS_CURRENT_REVISION | A | 22356 | NULL | NULL | YES | BTREE | | | article | 1 | article_composite2 | 5 | published | A | 22356 | NULL | NULL | YES | BTREE | | | article | 1 | art_is_current_rev | 1 | TORM_IS_CURRENT_REVISION | A | 6 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x2 | 1 | TORM_CLASS | A | 4 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x2 | 2 | TORM_IS_CURRENT_REVISION | A | 12 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x2 | 3 | article_date | A | 4471 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x2 | 4 | id | A | 11178 | NULL | NULL | | BTREE | | | article | 1 | art_comp_x3 | 1 | published | A | 2 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x3 | 2 | TORM_CLASS | A | 6 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x3 | 3 | TORM_IS_CURRENT_REVISION | A | 18 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x3 | 4 | article_date | A | 5589 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x3 | 5 | id | A | 11178 | NULL | NULL | | BTREE | | | article | 1 | art_comp_x4 | 1 | published | A | 2 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x4 | 2 | TORM_CLASS | A | 6 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x4 | 3 | TORM_IS_CURRENT_REVISION | A | 18 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x4 | 4 | article_date | A | 5589 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x4 | 5 | id | A | 11178 | NULL | NULL | | BTREE | | | article | 1 | art_comp_x5 | 1 | TORM_CLASS | A | 4 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x5 | 2 | TORM_IS_CURRENT_REVISION | A | 12 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x5 | 3 | article_date | A | 4471 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x5 | 4 | id | A | 11178 | NULL | NULL | | BTREE | | | article | 1 | art_id | 1 | id | A | 7452 | NULL | NULL | | BTREE | | | article | 1 | art_comp_x6 | 1 | id | A | 7452 | NULL | NULL | | BTREE | | | article | 1 | art_comp_x6 | 2 | published | A | 11178 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x6 | 3 | TORM_CLASS | A | 11178 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x6 | 4 | TORM_IS_CURRENT_REVISION | A | 11178 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x7 | 1 | TORM_CLASS | A | 4 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x7 | 2 | TORM_IS_CURRENT_REVISION | A | 12 | NULL | NULL | YES | BTREE | | | article | 1 | art_comp_x7 | 3 | id | A | 11178 | NULL | NULL | | BTREE | | | article | 1 | art_search_head | 1 | headline | NULL | 11178 | NULL | NULL | YES | FULLTEXT | | | article | 1 | art_search_summ | 1 | summary | NULL | 11178 | NULL | NULL | YES | FULLTEXT | | | article | 1 | art_search_body | 1 | body | NULL | 22356 | NULL | NULL | YES | FULLTEXT | | +---------+------------+--------------------+--------------+--------------------------+-----------+-------------+----------+--------+------+------------+---------+
Any thoughts? I'm about at my wits end and ready to go back to subselects. It's a shame because generally the system performs better with the joins but there are a lot of queries of this class doing big filesorts and wiping out any performance gains elsewhere...
Thanks!
Edited 2 time(s). Last edit at 02/23/2010 02:58PM by Tolan Blundell.
Subject
Views
Written By
Posted
Sorry, you can't reply to this topic. It has been closed.
Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.