MySQL Forums
Forum List  »  Performance

optimize IP range join: postgresql is 496 times faster
Posted by: keith cascio
Date: October 08, 2008 06:52PM

This is a re-post of my original here.

I am very curious to know how to write the following join query for MySQL so it will perform as well as PostgreSQL. I never used PostgreSQL before now. I like MySQL and I always use it. However I was disappointed by its performance on this query and I installed PostgreSQL to compare. The purpose of this query is to map ip addresses to countries. On my CentOS 5 machine, MySQL 5.0.45 takes 10 mins 45 seconds, PostgreSQL 8.1.11 takes 0 mins 1.3 seconds. Yes MySQL is 496 times slower. I'm sure there must be a way to hint MySQL to run faster. Here is the query:

mysql> select range.id_country from address join range on address.address between range.begin_num and range.end_num;

mysql> describe address;
+---------+------------------+------+-----+---------+-------+
| Field   | Type             | Null | Key | Default | Extra |
+---------+------------------+------+-----+---------+-------+
| address | int(10) unsigned | YES  |     | NULL    |       |
+---------+------------------+------+-----+---------+-------+

mysql> describe range;
+-------------+---------------------+------+-----+---------+-------+
| Field       | Type                | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+-------+
| begin_num   | int(10) unsigned    | NO   | PRI |         |       |
| end_num     | int(10) unsigned    | YES  | UNI | NULL    |       |
| id_country  | tinyint(3) unsigned | YES  | MUL | NULL    |       |
+-------------+---------------------+------+-----+---------+-------+
Both tables are MyISAM type. Table `address` is 2124 rows (all distinct). Table `range` is 105920 rows (all distinct).

The best answer I found so far is here. Steinbrink gives a way to write it as a join on subquery that my MySQL finished in 6 min 42 sec (63% the time of the simple join version). That is still 310 times slower than PostgreSQL. Too slow!

Actually there is a small error in the SQL at that url:
ORDER BY ip_address DESC
should be:
ORDER BY start      DESC
Here is MySQL's explanation of the original query:
mysql> explain select range.id_country from address join range on address.address between range.begin_num and range.end_num;
+----+-------------+---------+------+-----------------+------+---------+------+--------+------------------------------------------------+
| id | select_type | table   | type | possible_keys   | key  | key_len | ref  | rows   | Extra                                          |
+----+-------------+---------+------+-----------------+------+---------+------+--------+------------------------------------------------+
|  1 | SIMPLE      | address | ALL  | NULL            | NULL | NULL    | NULL |   2124 |                                                |
|  1 | SIMPLE      | range   | ALL  | PRIMARY,end_num | NULL | NULL    | NULL | 105920 | Range checked for each record (index map: 0x7) |
+----+-------------+---------+------+-----------------+------+---------+------+--------+------------------------------------------------+
Here is PostgreSQL's explanation of the original query:
postgresql# explain select range.id_country from address join range on address.address between range.begin_num and range.end_num;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=5.72..7061709.90 rows=83990942 width=2)
   ->  Seq Scan on range  (cost=0.00..3316.47 rows=185547 width=18)
   ->  Bitmap Heap Scan on address  (cost=5.72..31.25 rows=453 width=8)
         Recheck Cond: ((address.address >= "outer".begin_num) AND (address.address <= "outer".end_num))
         ->  Bitmap Index Scan on addresses_pkey  (cost=0.00..5.72 rows=453 width=0)
               Index Cond: ((address.address >= "outer".begin_num) AND (address.address <= "outer".end_num))
Here is MySQL's explanation showing that a simple query can use the index on begin_num in a "range" type query plan:
mysql> explain select id_country from range where 123123123 between begin_num and end_num;
+----+-------------+---------+-------+-----------------+---------+---------+------+------+-------------+
| id | select_type | table   | type  | possible_keys   | key     | key_len | ref  | rows | Extra       |
+----+-------------+---------+-------+-----------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | range   | range | PRIMARY,end_num | PRIMARY | 4       | NULL |   35 | Using where |
+----+-------------+---------+-------+-----------------+---------+---------+------+------+-------------+
Others encountered this problem before.
References:

[*] MySQL Forums :: Optimizer :: How to optimize a range (best answer I found but not good enough)
[*] MySQL Performance Blog: Optimize IP Range Join / Daniweb: Optimize IP Range Join - MySQL
[*] MySQL Performance Blog: Range Optimization
[*] MySQL :: inner join on range criteria: unable to use index?
[*] Sergey Petrunia's blog Use of join buffer is now visible in EXPLAIN
[*] MySQL :: MySQL 5.0 Reference Manual :: 7.2.5 Range Optimization
[*] MySQL Bugs: #26963: Incorrect query results with CONST join compared to RANGE or ALL
[*] MySQL Bugs: #9693: bug in trying to join a value on a range (using indexes)

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.