MySQL Forums
Forum List  »  Full-Text Search

Re: Fulltext and order by date... any way to optimize?
Posted by: Apachez
Date: August 22, 2006 03:21AM

I noticed the same behavior in my forum where I previously was using mysql fulltext as "search engine".

The problem is even more visible when you search at words which are common among your rows since you are also using boolean mode just like I did.

For example you have indexed 1 million or more rows (so that using a LIKE search will be highly inefficient). You then perform a search with two or more words which at the same time are very common in your dataset. For example word1 itself has about 700k matches and word2 itself say have about 600k matches.

Mysql will then create a temp file, fill that with all 700k + 600k matches, group them on id, sort them on date and THEN return say the top 25 matches. The execution time will of course be worser with the amount of common words you add to your search query.

In my case on a dual p2 333MHz system this ended up taking more than 10 minutes for such a query and the client itself gets frustrated when they dont get any answer within approx 5 seconds, so they fired the same query one more time making my poor server having to perform two such queries which ended up in a DoS situation :P

I never managed to find any working solution to workaround mysql's behavior for this so I ended up writing my own fulltext search engine which is available at:

http://www.tbg.nu/tbgsearch/tbgsearch.zip (package with readme and source code)
http://www.tbg.nu/tbgsearch/readme.txt (readme)
http://www.tbg.nu/tbgsearch (html page with a better description on how this engine works, however it is not yet in sync with how the engine currently looks (some features has been added since I first wrote that html))



Edited 1 time(s). Last edit at 08/22/2006 03:22AM by Apachez .

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Fulltext and order by date... any way to optimize?
4005
August 22, 2006 03:21AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.