MySQL Forums
Forum List  »  Performance

Re: 25.000+ rows - MySQL SELECT consecutive numbers performance issue
Posted by: Rick James
Date: December 14, 2014 01:42PM

You want performance? I give you performance.

First, to demonstrate that my code works on your data:
mysql> SELECT * FROM f625060;
+----+
| n  |
+----+
|  1 |
|  2 |
|  3 |
|  5 |
|  7 |
|  8 |
|  9 |
| 10 |
+----+
8 rows in set (0.00 sec)

mysql> SELECT starts, ends
    ->     FROM (
    ->         SELECT  @start AS starts,
    ->                 if(mx=2, n-1, NULL) AS ends,
    ->                 @start := if(mn=1, n, NULL)
    ->             FROM ( SELECT @start = -9 ) v
    ->             JOIN (
    ->                 SELECT max(x) AS mx, min(x) AS mn, COUNT(*) AS ct, n
    ->                     FROM (
    ->                         ( SELECT 1 AS x, n FROM f625060 )
    ->                         UNION ALL
    ->                         ( SELECT 2,    n+1 FROM f625060 )
    ->                     ) z GROUP BY n
    ->                     HAVING ct = 1
    ->             ) w
    ->     ) u
    ->     WHERE starts IS NOT NULL;
+--------+------+
| starts | ends |
+--------+------+
|      1 |    3 |
|      5 |    5 |
|      7 |   10 |
+--------+------+
3 rows in set (0.00 sec)

Now some benchmarking; first populate table with 25K rows:
mysql> SELECT * FROM f625060 LIMIT 10;
+----+
| n  |
+----+
|  0 |
|  1 |
|  4 |
|  5 |
|  6 |
|  8 |
|  9 |
| 10 |
| 12 |
| 13 |

mysql> SELECT COUNT(*), MIN(n), MAX(n) FROM f625060;
+----------+--------+--------+
| COUNT(*) | MIN(n) | MAX(n) |
+----------+--------+--------+
|    25000 |      0 |  39999 |
+----------+--------+--------+
(I pulled 25K random numbers from all ints 0..39999.)

Let me test this code:
FLUSH STATUS;
select  a.n as start, min( c.n ) as end  
from      f625060 as a 
left join f625060 as b on a.n = b.n + 1 
left join f625060 as c on a.n <= c.n     -- This JOIN runs in 'quadratic' time
left join f625060 as d on c.n+1 = d.n 
where b.n is null  
  and c.n is not null 
  and d.n is null 
group by a.n;
SHOW SESSION STATUS LIKE 'Handler%';
but I gave up after 5 minutes.

Now for my code:
FLUSH STATUS;       -- prep for SHOW, below
SELECT starts, ends
    FROM (
        SELECT  @start AS starts,
                if(mx=2, n-1, NULL) AS ends,
                @start := if(mn=1, n, NULL)
            FROM ( SELECT @start = -9 ) v
            JOIN (
                SELECT max(x) AS mx, min(x) AS mn, COUNT(*) AS ct, n
                    FROM (
                        ( SELECT 1 AS x, n FROM f625060 )
                        UNION ALL
                        ( SELECT 2,    n+1 FROM f625060 )
                    ) z GROUP BY n
                    HAVING ct = 1
            ) w
    ) u
    WHERE starts IS NOT NULL;
SHOW SESSION STATUS LIKE 'Handler%';
|  39946 | 39961 |
|  39963 | 39965 |
|  39968 | 39969 |
|  39971 | 39974 |
|  39978 | 39979 |
|  39981 | 39982 |
|  39984 | 39986 |
|  39988 | 39988 |
|  39991 | 39991 |
|  39995 | 39996 |
|  39999 | 39999 |
+--------+-------+
9461 rows in set (0.08 sec)          -- Very good time; 9K ranges from 25K rows with values 0..39999.

mysql> SHOW SESSION STATUS LIKE 'Handler%';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_commit             | 1      |
| Handler_delete             | 0      |
| Handler_discover           | 0      |
| Handler_external_lock      | 4      |
| Handler_mrr_init           | 0      |
| Handler_prepare            | 0      |
| Handler_read_first         | 3      |
| Handler_read_key           | 50002  |
| Handler_read_last          | 0      |
| Handler_read_next          | 50000  | -- 2 * 25K, so 'linear' time
| Handler_read_prev          | 0      |
| Handler_read_rnd           | 18922  |
| Handler_read_rnd_next      | 172311 |
| Handler_rollback           | 0      |
| Handler_savepoint          | 0      |
| Handler_savepoint_rollback | 0      |
| Handler_update             | 15539  |
| Handler_write              | 172306 | -- Several tmp tables used; still 'linear'
+----------------------------+--------+
Even a million rows will probably take only a few seconds.

I started to demonstrate a way using several tmp tables and AUTO_INCREMENTs to number the rows, but abandoned it since is was even messier than this.

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: 25.000+ rows - MySQL SELECT consecutive numbers performance issue
1295
December 14, 2014 01:42PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.