MySQL Forums
Forum List  »  Optimizer & Parser

Re: Stop using filesort
Posted by: Jørgen Løland
Date: May 08, 2011 01:28PM

Hi Mark,

Some query clauses implicitly requires MySQL to handle rows in a certain order. In your query, MySQL needs ordered rows to resolve both the GROUP BY and ORDER BY clauses.

To get ordered rows, MySQL basically has two alternatives:
* read the rows through an index that provides the interesting order
* read the rows through any other access method and then do filesort

In many cases it is possible to create a composite (multi-column) index that provides the interesting order, but in this query that's not an option. The reason is that ORDER BY is performed on a result set field composed of columns from two different tables, and it's not possible to create a composite index using columns from two different tables. Thus, with this database schema your query has to do filesort. I advise you to store both search columns in one table and make a composite fulltext index for them.

I also notice that you use LEFT JOIN, which makes it difficult for MySQL to chose the optimal join order. Is it possible in your case to have rows in post_search without a join match in topic_search? If not, you should skip "LEFT". Also note that if there is a post_search row without a join match in topic_search, your query will output NULL NULL NULL... for that row since you select columns from topic_search.

That being said, I'm not sure this query does what you want it to do. I'm imagining this query is used to perform searches in an discussion forum like forums.mysql.com. You probably want the search to show only one row for each discussion thread (hence the GROUP BY clause), and the thread with the highest searchterm match displayed first.

Here's the problem: You select non-aggregated columns in a grouped query [1]. In the SQL standard, this is not legal and there are some good reasons for it. Consider this:

a  b
----
1  1
1  2
1  3

SELECT a, b FROM table GROUP BY a;

MySQL will make on group of all three records. Since b is not an aggregated column, MySQL is free to return one row with any of the three b values. Thus (1,1), (1,2) or (1,3) are all correct results. Granted, MySQL will most likely return (1,1) in this case, but that is not guaranteed.

In your query, you use a non-aggregated column ("score"), which is not in the GROUP BY clause, for ordering. This means that MySQL will pick one random [2] score for each group and use that value when ordering the resulting rows. Consider this:

a  b
----
1  1
1  3
2  2
2  1

SELECT a, b FROM table GROUP BY a ORDER BY b DESC;

You probably expect MySQL to return (1,3), (2,2), but the actual result is (2,2), (1,1). This is correct, and (1,3), (2,2) would also be correct. Be careful when you run aggregate queries with non-aggregated columns...

Btw, what you need to do to get it correct is to use the aggregate function "MAX(MATCH...) as score".


[1] http://dev.mysql.com/doc/refman/5.5/en/group-by-hidden-columns.html
[2] OK, it's not random. It will be the score value of the first row read from that group, but as I said nothing is guaranteed.

Hope this helps,
Jørgen Løland
Software Engineer, MySQL, Oracle
jorgenloland.blogspot.com

Options: ReplyQuote


Subject
Views
Written By
Posted
5759
May 06, 2011 02:14PM
Re: Stop using filesort
2484
May 08, 2011 01:28PM
1753
May 09, 2011 10:57AM
3636
May 10, 2011 04:40AM
1695
May 10, 2011 11:41AM
2600
May 10, 2011 12:54PM
1795
May 12, 2011 08:04AM
1383
May 12, 2011 08:52AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.