MySQL Forums

Re: Zip Code Proximity search
Posted by: Gregor Melhorn
Date: October 24, 2004 12:34PM

Anyway, I tried using the precalculation version, with really good results:

Germany has around 8,300 different zip codes.

I created 3 range tables:

Zip range 0 to 100 km has 5,757,601 records, 94.9 MB
Zip range 100 to 150 km has 5,671,204 records, 92.6 MB
Zip range 150 to 250 km (seldom used) has 14,223,809 records, 231.8 MB.

Table structure:

CREATE TABLE `zip100` (
`zip1` mediumint( 5 ) unsigned zerofill NOT NULL default '00000',
`distance` tinyint( 3 ) unsigned NOT NULL default '0',
`zip2` mediumint( 5 ) unsigned zerofill NOT NULL default '00000',
`land1` enum( 'D', 'CH', 'A', 'NL' ) NOT NULL default 'D',
`land2` enum( 'D', 'CH', 'A', 'NL' ) NOT NULL default 'D',
KEY `land` ( `land1` , `zip1` , `distance` , `zip2` , `land2` )
) ENGINE = MYISAM PACK_KEYS =1;

I only need distances up to 250 km, so an unsigned tinyint is sufficient.

I'm not sure whether "pack_keys" noticeably influences performance. Maybe someone knows and could post it here?

At present there are no zips other than from Germany in the database.

Values were inserted using a Perl script that fetches data from the opengeodb project. This usually took around 40 - 50 min on my 1800 MHz development machine with 512 MB RAM.
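
The Perl script is not shown here. As a rough sketch of the same precalculation done directly in SQL instead, assuming a hypothetical table zip_coords(land, zip, lat, lon) that holds one opengeodb coordinate pair per zip code in decimal degrees, the zip100 table could be filled with the haversine formula. The table name, its columns, the 6371 km earth radius and the rounding to whole kilometres are all assumptions, not part of the original setup:

```sql
-- Hypothetical source table (not from the post):
--   zip_coords(land, zip, lat, lon), coordinates in decimal degrees.
-- Needs derived-table support (MySQL 4.1 or later).

INSERT INTO zip100 (zip1, distance, zip2, land1, land2)
SELECT p.zip1, p.km, p.zip2, p.land1, p.land2
FROM (
    SELECT a.zip  AS zip1,
           b.zip  AS zip2,
           a.land AS land1,
           b.land AS land2,
           -- Haversine great-circle distance, rounded to whole km.
           ROUND(6371 * 2 * ASIN(SQRT(
               POWER(SIN(RADIANS(b.lat - a.lat) / 2), 2)
             + COS(RADIANS(a.lat)) * COS(RADIANS(b.lat))
             * POWER(SIN(RADIANS(b.lon - a.lon) / 2), 2)
           ))) AS km
    FROM zip_coords AS a
    CROSS JOIN zip_coords AS b      -- every ordered pair, including a zip with itself
) AS p
WHERE p.km <= 100;                  -- keep only pairs that belong in the 0 to 100 km table
```

A cross join over roughly 8,300 zip codes evaluates about 69 million pairs, so this is a one-off batch job, much like the script.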

Optimizing:

Make sure nothing accesses the tables while optimizing, otherwise they might get corrupted.

Tables were physically sorted using "myisamchk --sort-records=1 <table>"

Then I used "myisampack <table>" for compression (usually around 30%). Again, I'm not sure whether compression noticeably influences performance; I will check on this...

After compression, I executed "myisamchk -rqSa <table>".

Now, what do we get from this?

"SELECT land2, zip2 FROM zip100 WHERE land1='D' AND zip1=74585" will get you: 1105 rows in set (0.01 sec)

This can now be used in a join with the user table.
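
A minimal sketch of such a join, assuming a hypothetical users table with land and zip columns (only the zip100 columns come from the table structure above) and an example radius of 50 km:

```sql
-- users(land, zip, ...) is a hypothetical table; adjust names to your schema.
SELECT u.*, z.distance
FROM zip100 AS z
JOIN users  AS u
  ON u.land = z.land2
 AND u.zip  = z.zip2
WHERE z.land1 = 'D'
  AND z.zip1  = 74585        -- zip code to search around
  AND z.distance <= 50       -- radius in km, must stay within the table's range
ORDER BY z.distance;
```

Since land1, zip1 and distance are the leading parts of the key, the lookup on zip100 should be resolvable from the index alone; how fast the join is then depends on an index on the user table's zip column.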

On my developer machine (Windows XP Pro, completely unoptimized, badly short of memory and with no MySQL buffers tuned), queries against the zip100 table return in around 0.00 - 0.03 seconds. The zip250 table takes more time, usually 0.07 - 0.09 seconds.
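
If the range tables are disjoint, as the record counts suggest, a search radius beyond 100 km has to combine more than one table. A sketch for a 120 km search, assuming the 100 to 150 km table is named zip150 (the name is a guess; only zip100 and zip250 are named here) and has the same structure as zip100:

```sql
-- All pairs up to 100 km come from zip100 without a distance filter.
SELECT land2, zip2, distance
FROM zip100
WHERE land1 = 'D' AND zip1 = 74585
UNION ALL
-- The remaining 100 to 120 km pairs come from the assumed zip150 table.
SELECT land2, zip2, distance
FROM zip150
WHERE land1 = 'D' AND zip1 = 74585 AND distance <= 120;
```

UNION ALL skips the duplicate check, which is safe only under the assumption that no pair is stored in both tables.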

This can still be optimized a lot, since on my machine the data has to be read from the hard disk. With tuned buffers and enough memory, results should come back at maximum read speed with no sorting needed, I suppose.
