MySQL :: Re: InnoDB and FULLTEXT

New Topic

Re: InnoDB and FULLTEXT

Posted by: James Day
Date: January 28, 2005 04:45PM

We used to do it real time but load issues at peak times forced us to switch to batch updating. A few months ago slave load issues caused us to stop replicating the search table during those batch updates and now it's done on the master only and copied to offline slaves. To give some idea, a load table from master for all of our current article text takes about 3-4 hours to run on a dual Opteron slave when it's out of service. For some idea of volume, we see about 400,000 to 500,000 page changes per week. Using batch operations significantly cut the number of updates we needed to because the same page tends to be edited many times. The dual part in the Opteron doesn't seem to be of much value for this - seems to be single-threaded.

Today index updates are processed on the master via a cron job and I take pairs of slaves offline off peak to copy the tables across. We've pretty much outgrown MyISAM fulltext search and have been experimenting with other approaches, notably Lucene, which, with its index merging capability, may help to handle the update time issues we've encountered. On the other hand, most places aren't sites in the top 100 English language or 150 worldwide, with our volume of updates, so it's very capable for most users. We're definitely in the push it until it breaks category.

For our load, a 3GB P4 using a java-based Lucene engine handled load comparable to about 2-3 4GB dual Opterons with 6 disk RAID 10 sets. Assorted issues still to be resolved before we actually switch - technical and political - and it might not happen. Until then I plan to experiment with some non-replicating 3GB P4s and see how the lower disk load and fairly high RAM level lets them do compared to the main database servers. I expect them to do fairly well - better in cost-effectiveness than the standard database servers - but we'll see. May be glitches in our code which rules them out or my expectation that they will do fairly well may be disproved.

At present, search is about 50-70% of our database load and an ongoing scalability challenge because it's one of the things we don't cache. We're seeing an increasing tendency of people to come to us first. That's good for mindshare but not so good for an area where we're still addressing scalability...:)

Note also that our search query and schema has significant overhead not related to MySQL full text search - the full text portion is actually by far the faster part of the search. Fixes for this are planned in the next version of the MediaWiki software.

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

InnoDB and FULLTEXT

14335

Gilles Massen

January 24, 2005 07:13AM

Re: InnoDB and FULLTEXT

8498

Heikki Tuuri

January 24, 2005 01:10PM

Re: InnoDB and FULLTEXT

6718

Gilles Massen

January 25, 2005 01:37AM

Re: InnoDB and FULLTEXT

6030

James Day

January 26, 2005 07:50AM

Re: InnoDB and FULLTEXT

5583

Gilles Massen

January 28, 2005 04:34AM

Re: InnoDB and FULLTEXT

9977

James Day

January 28, 2005 04:45PM

Re: InnoDB and FULLTEXT

5409

Gilles Massen

February 01, 2005 07:04AM

Re: InnoDB and FULLTEXT

6220

Naveen Kumar

February 28, 2005 12:54AM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.