MySQL Forums
Forum List  »  Performance

Optimize query and/or db structure, FullText search + sort by other fields
Posted by: Aleksandr Guidrevitch
Date: November 05, 2004 06:38AM

Hi there !

My objective is to implement quick, really quick complex fulltext search with 'order by' (< 2 seconds).
The actual table I'd like to search is `lot`. I've created 2 helper tables : 1. `search`, which contains normalized (or stemmed) title and description with fulltext search index 2. `category_map` which covers all relations of categories, and is especially useful when I need to perform search on all children categories of current category.

The total count of records in lot, and correspondingly in `search`, tables is expected to be 50 000, each around 2048 (stemmed description) + 50 (stemmed title) bytes.

I need 1) fulltextsearch within all records, 2) category listing, 3) fulltextsearch within a category.
Also, the result set should be *always* ordered either by lot.end_time, lot.current_price * exchange_rate, lot.title or lot.bid_count. The fulltextsearch relevance is not important, I'm ready to ignore it in favor of performance.

Some time ago I've tried to put fulltext right inside the `lot` table, but it seemed to produce much longer lasting queries than the current, in plus in this case i was unable to LOAD INDEX INTO CACHE because of different key length of FULLTEXT and other table's indexes. That's why I've separated fields which I'd like to index fulltext into `search` table. But now I'm stuck with performance degradation of ORDER BY. I've tried to play with the related sort_buffer_size, myisam_sort_buffer_size and other params described in the docs related ORDER BY and filesort, but with no significant performance influence. I do not mind to create as many helper tables as needed, but I cant figure out what else can I improve. Another disappointing problem is that it is rather slow to generate 50 000 records for different table structures (around 2 hours) just to play with them for 10-20 mins.

Here is what I came up with at the moment:

SELECT lot.id, search.title FROM search LEFT JOIN lot ON lot.id = search.lot_id, category_map, exchange_rate WHERE MATCH(search.title, search.description) AGAINST ('some query' IN BOOLEAN MODE) AND category_map.child_id = lot.category_id AND category_map.parent_id = 10 AND exchange_rate.currency_id = lot.currency_id ORDER BY lot.current_price * exchange_rate.exchange_rate LIMIT 0, 50

Which produces almost acceptable results (which I'd like to improve further) if I omit ORDER BY statement. I do understand that the slowdown occurs mostly during filesort, and partially due to temporary table.

The question is
1) Which is a desirable tables structure which will allow to perform fulltextsearch + ordering
2) Which is preferable LEFT JOIN .. ON, or WHERE ? I ahaven't noticed any significant difference trying both of them

EXPLAIN:

+----+-------------+---------------+----------+---------------------+-------------------+---------+--------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+----------+---------------------+-------------------+---------+--------------------------+------+----------------------------------------------+
| 1 | SIMPLE | search | fulltext | title_description | title_description | 0 | | 1 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | lot | eq_ref | PRIMARY,category_id | PRIMARY | 4 | tauction.search.lot_id | 1 | Using where |
| 1 | SIMPLE | exchange_rate | ref | currency_id | currency_id | 4 | tauction.lot.currency_id | 2 | Using index |
| 1 | SIMPLE | category_map | ref | parent_id,child_id | child_id | 4 | tauction.lot.category_id | 12 | Using where |
+----+-------------+---------------+----------+---------------------+-------------------+---------+--------------------------+------+----------------------------------------------+
4 rows in set (0.00 sec)


Table structure as follows.

DROP TABLE IF EXISTS lot;
CREATE TABLE lot (
id int unsigned not null primary key auto_increment,
owner_id int unsigned not null default 0,

title char(50) not null,
description text not null,

current_price decimal(16,2) unsigned not null,
quantity int unsigned not null,
bid_count int unsigned not null,

start_time int unsigned not null,
end_time int unsigned not null,

city char(35) not null,

category_id int unsigned not null,
currency_id int unsigned not null,
country_id int unsigned not null,
state_id int unsigned not null,
delivery_id int unsigned not null,

enabled bool default 1 not null,

INDEX title (title),

INDEX current_price (current_price),
INDEX bid_count (bid_count),

INDEX end_time (end_time),

INDEX category_id (category_id)

) TYPE = MyISAM COMMENT = "Lots";

DROP TABLE IF EXISTS search;
CREATE TABLE search (
lot_id int unsigned not null,

title char(50) not null,
description text not null,

FULLTEXT title (title),
FULLTEXT title_description (title, description)
) TYPE = MyISAM COMMENT = "Search helper";

DROP TABLE IF EXISTS category_map;
CREATE TABLE category_map (
parent_id int unsigned not null,
child_id int unsigned not null,
INDEX parent_id (parent_id),
INDEX child_id (child_id)
) TYPE = MyISAM COMMENT = "Categories Map";

Options: ReplyQuote


Subject
Views
Written By
Posted
Optimize query and/or db structure, FullText search + sort by other fields
4683
November 05, 2004 06:38AM
2367
November 06, 2004 06:55PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.