MySQL Forums
Forum List  »  Performance

Re: GEO Search with large table is very slow
Posted by: Rick James
Date: April 04, 2009 12:51PM

The order of ANDed things in the WHERE clause never(?) matters. a=1 AND b=1 will never(?) be optimized differently than b=1 AND a=1.

The order of fields in a compound index DOES matter. Often severely. Given INDEX (beds, baths, city)
all of these are very good (assuming this is the entire WHERE clause):
city = 1 AND baths = 2 AND beds = 2 -- all '='
city IN (1,4,7) AND baths = 2 AND beds = 2 -- first fields in index are '='

Less good:
city = 1 AND baths = 2 AND beds BETWEEN 2 AND 4 -- Won't get past beds
city = 1 AND baths IN (1.5, 2) AND beds =3 -- Won't get past baths

Really bad:
(beds = 2 OR beds = 3) AND ... -- might not even use the index.*

*Well, it might -- there have been improvements, and there is the "merge index" feature starting with 5.0.

OR is usually the worst -- it likes to throw up its hands and degenerate into a scan
IN is iffy -- it might feel like OR, or it might turn into a 'range'
BETWEEN (or <, >, LIKE 'xx%', etc) -- these are 'range scans'; they are implemented by jumping into the index at the starting value and scanning forward. (Think of looking you up in a directory, knowing only your last name, and street address. -- Jump to the first Gonsman, then scan forward.)

Another example: Looking up "Gonsman, S. in Canada" becomes
lastname = 'Gonsman' AND firstname LIKE 'S%' AND country = 'CA'
If the index is (lastname, firstname, country), it will start at the first Gonsman, S, scan forward, and probably ignore the country during the scan.
If the index is (lastname, country, firstname), that will be more efficient (though, perhaps less logical). Now you have '=' on the first two fields in the index, and will need to scan only the Canadian Gonsman's beginning with S.

(Apologies to you or Canada if the match is wrong.)

Technically, IN is not a "range". But IN (1,2,3) where the datatype is INT can easily be optimized into a the range BETWEEN 1 AND 3.

Benchmarking...
* restart mysql -- then nothing is cached in memory. (Yuck)
* FLUSH TABLES; -- this might get you to a clean state.
* Do some big operation on a big table (OPTIMIZE a big enough table that you are not timing) -- this will take a little time, but will flush caches.
* SHOW STATUS LIKE 'Com_%'; before and after, then take differences. (Or there is a way to reset the values, so you don't need the diffs.)
* Watch the slowlog. -- "Rows examined" is really high when you have one of the "bad" WHERE / INDEX combos.

Get on 5.0 or later, read
http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html
watch the EXPLAIN plan, and try to tune your queries so that they take advantage index-merge. Please report back on what does and does not trigger use of index-merge. In particular I would like to know if a lat+long query could use index-merge. (This would require separate indexes starting with lat and with long.) The last column of EXPLAIN will tell you.

Remember:
(latitude BETWEEN ... AND ...) OR latitude IS NULL
is a 'bad' OR. Consider turning that into a UNION.

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: GEO Search with large table is very slow
3329
April 04, 2009 12:51PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.