Re: GEO Search with large table is very slow
Posted by: Rick James
Date: March 18, 2009 09:02PM
You've got a good handle on how indexes work. Some notes:
"In my research, I was under the impression that only one index can be used per query." -- Yes. However, beginning with 5.0, it can _sometimes_ use two indexes via "index merge".
"MySQl picks the best one." -- It tries. And usually it succeeds.
"So I decided to make indexes with several columns." -- Not as good as having a compound index _when_ it can use multiple columns from such an index.
"My understanding was that MySQl would use all of the columns in the index as long as every column was being used in the query." -- I would phrase it differently... It will use the fields, from left to right, as long as the WHERE clause has "=". Once it hits a 'range' (>, IN, BETWEEN, ...), it stops with that field. If you finish with all the WHERE parts, then it might be able to use the next field for ORDER BY.
OTOH, if the entire select can be satisfied by some or all the fields in one index, it may use that index instead of having to reach over into the data. The EXPLAIN will say "Using index".
"This is why I didn't include longitude in the index." -- But if you have INDEX(xref_status_id, longitude) (not the opposite order), it can use both parts. Note the "=" in "xref_status_id = '3'"
"I also noticed that after a certain number of columns were added to an index, it didn't seem to help anymore." -- I suspect that is the wrong conclusion. Instead, other things (discussed above) make this appear to be the case.
"This threshold was reached once the index contained more rows[sic] than the table." -- How can an index have more columns than in the table?
Your idea of separating the big text fields from the search criteria is a good one. And you seemed to understand why. Why it gets better and worse is a puzzle.
Caching -- Aside from the Query Cache, the data blocks and index blocks are cached in RAM. Right after you restart mysql, you will find that a query like the one being discussed takes a long time. If you run exactly the same query (but avoid the Query Cache) you will find that it runs 10x faster. Why? The first run had to fetch lots of stuff from disk. The second query did not. Then, if you run a similar query with different values, its running time will probably be somewhere in between. Why? Some, but not all, of the necessary blocks will be in cache. My mantra "count the disk hits".
Is your 'slowness' just right after restarting mysql?
ANALYZE TABLE properties_2_stats
when the 'slowness' hits. It _might_ help.