MySQL Forums
Forum List  »  General

Re: Optimised way to search over 2 milllion poi data in mysql
Posted by: Rick James
Date: September 02, 2014 12:11AM

> we want to find the nearest point with out bothering how far the location is

You may not care how far it is, but you do have to measure it in order to determine that it is closer than all others.

Let's start over... If there are no indexes and no partitioning, you have to scan the entire table, checking each one to see if it is the "nearest". If the table is bigger than will fit in cache, that means reading the entire table every time. Clearly, this is not a good solution.

Now, let's add the obvious
INDEX(latitude, longitude)
and the obvious
WHERE latitude BETWEEN 40 AND 44 AND longitude BETWEEN 107 and 111
Now, let's analyze how that will work. The optimizer simply cannot do anything with longitude; all it can do is search all entries with latitude BETWEEN 40 AND 44, filtering on longitude as it goes. This is a significant fraction of the total number of rows, and it is significantly more than the desired 4 degrees by 4 degrees that one would imagine the index would provide.

Indexes are inherently 1-dimensional, yet the problem is 2-dimensional. So, the trick I present is to use partition pruning on latitude as the first dimension, at least cutting down the search area. Then the clustered PRIMARY KEY to gives the second dimensional search.

Think of the number of disk hits. If there are no results at +/-2, then there was one disk hit to find nothing. If +/-4 also comes up empty, then again, no (or virtually no) disk hits. Eventually, you get a result. At that is some fine tuning, such as using the Great Circle formula to be precise about which one is "nearest" (in case several show up).

Options: ReplyQuote


Subject
Written By
Posted
Re: Optimised way to search over 2 milllion poi data in mysql
September 02, 2014 12:11AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.