Skip navigation links

MySQL Forums :: Performance :: optimize IP range join: postgresql is 496 times faster


Advanced Search

optimize IP range join: postgresql is 496 times faster
Posted by: keith cascio ()
Date: October 08, 2008 06:52PM

This is a re-post of my original here.

I am very curious to know how to write the following join query for MySQL so it will perform as well as PostgreSQL. I never used PostgreSQL before now. I like MySQL and I always use it. However I was disappointed by its performance on this query and I installed PostgreSQL to compare. The purpose of this query is to map ip addresses to countries. On my CentOS 5 machine, MySQL 5.0.45 takes 10 mins 45 seconds, PostgreSQL 8.1.11 takes 0 mins 1.3 seconds. Yes MySQL is 496 times slower. I'm sure there must be a way to hint MySQL to run faster. Here is the query:

mysql> select range.id_country from address join range on address.address between range.begin_num and range.end_num;

mysql> describe address;
+---------+------------------+------+-----+---------+-------+
| Field   | Type             | Null | Key | Default | Extra |
+---------+------------------+------+-----+---------+-------+
| address | int(10) unsigned | YES  |     | NULL    |       |
+---------+------------------+------+-----+---------+-------+

mysql> describe range;
+-------------+---------------------+------+-----+---------+-------+
| Field       | Type                | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+-------+
| begin_num   | int(10) unsigned    | NO   | PRI |         |       |
| end_num     | int(10) unsigned    | YES  | UNI | NULL    |       |
| id_country  | tinyint(3) unsigned | YES  | MUL | NULL    |       |
+-------------+---------------------+------+-----+---------+-------+
Both tables are MyISAM type. Table `address` is 2124 rows (all distinct). Table `range` is 105920 rows (all distinct).

The best answer I found so far is here. Steinbrink gives a way to write it as a join on subquery that my MySQL finished in 6 min 42 sec (63% the time of the simple join version). That is still 310 times slower than PostgreSQL. Too slow!

Actually there is a small error in the SQL at that url:
ORDER BY ip_address DESC
should be:
ORDER BY start      DESC
Here is MySQL's explanation of the original query:
mysql> explain select range.id_country from address join range on address.address between range.begin_num and range.end_num;
+----+-------------+---------+------+-----------------+------+---------+------+--------+------------------------------------------------+
| id | select_type | table   | type | possible_keys   | key  | key_len | ref  | rows   | Extra                                          |
+----+-------------+---------+------+-----------------+------+---------+------+--------+------------------------------------------------+
|  1 | SIMPLE      | address | ALL  | NULL            | NULL | NULL    | NULL |   2124 |                                                |
|  1 | SIMPLE      | range   | ALL  | PRIMARY,end_num | NULL | NULL    | NULL | 105920 | Range checked for each record (index map: 0x7) |
+----+-------------+---------+------+-----------------+------+---------+------+--------+------------------------------------------------+
Here is PostgreSQL's explanation of the original query:
postgresql# explain select range.id_country from address join range on address.address between range.begin_num and range.end_num;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=5.72..7061709.90 rows=83990942 width=2)
   ->  Seq Scan on range  (cost=0.00..3316.47 rows=185547 width=18)
   ->  Bitmap Heap Scan on address  (cost=5.72..31.25 rows=453 width=8)
         Recheck Cond: ((address.address >= "outer".begin_num) AND (address.address <= "outer".end_num))
         ->  Bitmap Index Scan on addresses_pkey  (cost=0.00..5.72 rows=453 width=0)
               Index Cond: ((address.address >= "outer".begin_num) AND (address.address <= "outer".end_num))
Here is MySQL's explanation showing that a simple query can use the index on begin_num in a "range" type query plan:
mysql> explain select id_country from range where 123123123 between begin_num and end_num;
+----+-------------+---------+-------+-----------------+---------+---------+------+------+-------------+
| id | select_type | table   | type  | possible_keys   | key     | key_len | ref  | rows | Extra       |
+----+-------------+---------+-------+-----------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | range   | range | PRIMARY,end_num | PRIMARY | 4       | NULL |   35 | Using where |
+----+-------------+---------+-------+-----------------+---------+---------+------+------+-------------+
Others encountered this problem before.
References:

[*] MySQL Forums :: Optimizer :: How to optimize a range (best answer I found but not good enough)
[*] MySQL Performance Blog: Optimize IP Range Join / Daniweb: Optimize IP Range Join - MySQL
[*] MySQL Performance Blog: Range Optimization
[*] MySQL :: inner join on range criteria: unable to use index?
[*] Sergey Petrunia's blog Use of join buffer is now visible in EXPLAIN
[*] MySQL :: MySQL 5.0 Reference Manual :: 7.2.5 Range Optimization
[*] MySQL Bugs: #26963: Incorrect query results with CONST join compared to RANGE or ALL
[*] MySQL Bugs: #9693: bug in trying to join a value on a range (using indexes)

Options: ReplyQuote


Subject Views Written By Posted
optimize IP range join: postgresql is 496 times faster 8141 keith cascio 10/08/2008 06:52PM
Re: optimize IP range join: postgresql is 496 times faster 2029 Roland Bouman 12/10/2008 04:18AM
Re: optimize IP range join: postgresql is 496 times faster 2543 keith cascio 12/10/2008 12:14PM
Re: optimize IP range join: postgresql is 496 times faster 2043 Roland Bouman 12/10/2008 01:15PM
Re: optimize IP range join: postgresql is 496 times faster 2049 keith cascio 12/10/2008 01:24PM
Re: optimize IP range join: postgresql is 496 times faster 1947 Roland Bouman 12/10/2008 01:50PM
Re: optimize IP range join: postgresql is 496 times faster 1897 keith cascio 12/12/2008 06:01PM
Re: optimize IP range join: postgresql is 496 times faster 1965 Rick James 12/10/2008 08:19PM
Re: optimize IP range join: postgresql is 496 times faster 1950 keith cascio 12/10/2008 08:52PM
Re: optimize IP range join: postgresql is 496 times faster 1917 keith cascio 12/12/2008 06:03PM
Re: optimize IP range join: postgresql is 496 times faster 2079 Rick James 12/10/2008 10:46AM
Re: optimize IP range join: postgresql is 496 times faster 2236 keith cascio 12/10/2008 12:30PM
Re: optimize IP range join: postgresql is 496 times faster 2030 keith cascio 12/10/2008 12:43PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.