MySQL Forums
Forum List  »  Quality Assurance

Last row performance issue with LIMIT clause
Posted by: Doug Carr
Date: November 09, 2007 01:42PM

Hi,

In both 4.1.22 and 5.0.45, I have found that having a LIMIT clause on a SELECT statement where the LIMIT would include the final row of the table causes poor performance. So, if I have the table:

mysql> desc activity;
+--------------+-----------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------+------+-----+---------------------+----------------+
| id | mediumint(9) | | PRI | NULL | auto_increment |
| timestamp | datetime | | MUL | 0000-00-00 00:00:00 | |
| client_id | varchar(25) | | MUL | | |
| ip_addr | varchar(25) | YES | | NULL | |
| entity_host | varchar(100) | YES | | NULL | |
| status | int(3) unsigned | | | 200 | |
| user_agent | varchar(100) | YES | | NULL | |
| content_type | varchar(50) | YES | | NULL | |
| entity_size | int(11) | YES | | 0 | |
| object | varchar(255) | YES | | NULL | |
+--------------+-----------------+------+-----+---------------------+----------------+

with the keys:
mysql> show index from activity;
+----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| activity | 0 | PRIMARY | 1 | id | A | 210708 | NULL | NULL | | BTREE | |
| activity | 1 | timestamp | 1 | timestamp | A | 105354 | NULL | NULL | | BTREE | |
| activity | 1 | timestamp | 2 | id | A | 210708 | NULL | NULL | | BTREE | |
| activity | 1 | client_id | 1 | client_id | A | 19155 | NULL | NULL | | BTREE | |
| activity | 1 | client_id | 2 | id | A | 210708 | NULL | NULL | | BTREE | |
+----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
5 rows in set (0.02 sec)
and the row count:

mysql> select count(*) from activity;
+----------+
| count(*) |
+----------+
| 210708 |
+----------+
1 row in set (0.02 sec)

then this query is fast:

mysql> select * from activity order by id asc limit 1 offset 210706;
+--------+---------------------+-----------+-----------+-------------+--------+------------+--------------+-------------+--------+
| id | timestamp | client_id | ip_addr | entity_host | status | user_agent | content_type | entity_size | object |
+--------+---------------------+-----------+-----------+-------------+--------+------------+--------------+-------------+--------+
| 210707 | 2007-10-17 07:13:17 | cec | 127.0.0.1 | tom | 200 | Mozilla | text | 37256 | bob |
+--------+---------------------+-----------+-----------+-------------+--------+------------+--------------+-------------+--------+
1 row in set (0.89 sec)

but this one is slow:

mysql> select * from activity order by id asc limit 1 offset 210707;
+--------+---------------------+-----------+---------+-------------+--------+------------+--------------+-------------+--------+
| id | timestamp | client_id | ip_addr | entity_host | status | user_agent | content_type | entity_size | object |
+--------+---------------------+-----------+---------+-------------+--------+------------+--------------+-------------+--------+
| 210734 | 2007-11-08 07:07:07 | | NULL | NULL | 200 | NULL | NULL | 0 | NULL |
+--------+---------------------+-----------+---------+-------------+--------+------------+--------------+-------------+--------+
1 row in set (29.00 sec)

If I explain the queries, the fast one uses indexes, the slow one uses file sort:
mysql> explain select * from activity order by id asc limit 1 offset 210706;
+----+-------------+----------+-------+---------------+---------+---------+------+--------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+--------+-------+
| 1 | SIMPLE | activity | index | NULL | PRIMARY | 3 | NULL | 210708 | |
+----+-------------+----------+-------+---------------+---------+---------+------+--------+-------+
1 row in set (0.00 sec)

mysql> explain select * from activity order by id asc limit 1 offset 210707;
+----+-------------+----------+------+---------------+------+---------+------+--------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+------+---------+------+--------+----------------+
| 1 | SIMPLE | activity | ALL | NULL | NULL | NULL | NULL | 210708 | Using filesort |
+----+-------------+----------+------+---------------+------+---------+------+--------+----------------+
1 row in set (0.00 sec)

Looking at the source code, in sql_select.cpp line 12482 (in the 5.0.45 source) or line 7836 (in the 4.1.22 source) there is this:

/*
If not used with LIMIT, only use keys if the whole query can be
resolved with a key; This is because filesort() is usually faster than
retrieving all rows through an index.
*/
if (select_limit >= table->file->records)
{
scanning keys...
}
else
usable keys...

If I change that '=>' in 'select_limit >= table->file->records' to '>' then both queries are fast and neither 'explain' uses filesort. If I exceed the number of rows (for example, limit 2 offset 210707, or limit 1 offset 210708) then the query is slow again (and explain shows filesort). I can accept that, as my code can make sure that the offset + row count in the limit clause doesn't exceed the table size.

The >= vs > looks like an off-by-one error, but my worry is that it may be like that for a reason that isn't obvious to me.

So, my question is: will there be any harmful side effects if the >= is changed to >?

Thanks,
Doug.

Options: ReplyQuote


Subject
Views
Written By
Posted
Last row performance issue with LIMIT clause
4660
November 09, 2007 01:42PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.