MySQL Forums
Forum List  »  Optimizer & Parser

Query with GROUP BY (Using index for group-by -> Using index)
Posted by: mnf
Date: November 30, 2006 05:20AM

Hello!

I have a very simple table for testing with ~7.500000 rows and all needed keys:

+-------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+----------------------+------+-----+---------+----------------+
| i | int(10) unsigned | NO | PRI | NULL | auto_increment |
| n | smallint(5) unsigned | NO | MUL | | |
| t | int(10) unsigned | NO | MUL | | |
| v | smallint(6) | NO | | | |
+-------+----------------------+------+-----+---------+----------------+

Typical query with GROUP BY is optimized great:

mysql> EXPLAIN SELECT MAX(i) FROM shn1 GROUP BY n;
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| 1 | SIMPLE | shn1 | range | NULL | ni | 2 | NULL | 216 | Using index for group-by |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+

mysql> SELECT MAX(i) FROM shn1 GROUP BY n;
...
215 rows in set (0.00 sec)

Now I want all row (i, n, t, v) with MAX(i), I tried this:

1) subquery
mysql> EXPLAIN SELECT i, n, t, v FROM shn1 WHERE i IN (SELECT MAX(i) FROM shn1 GROUP BY n);
+----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+
| 1 | PRIMARY | shn1 | ALL | NULL | NULL | NULL | NULL | 7699429 | Using where |
| 2 | DEPENDENT SUBQUERY | shn1 | index | NULL | ni | 6 | NULL | 7699429 | Using index |
+----+--------------------+-------+-------+---------------+------+---------+------+---------+-------------+

As you see it is very bad, it uses DEPENDENT QUERY (btw, how to override?) and ALL scan and of course it takes hours.

2) subquery
mysql> EXPLAIN SELECT i, n, t, v FROM shn1 AS T1, (SELECT MAX(i) AS ii FROM shn1 GROUP BY n) AS T2 WHERE T1.i = T2.ii;
+----+-------------+------------+--------+---------------+---------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+-------+---------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 215 | |
| 1 | PRIMARY | T1 | eq_ref | PRIMARY,in,it | PRIMARY | 4 | T2.ii | 1 | |
| 2 | DERIVED | shn1 | index | NULL | ni | 6 | NULL | 7699429 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+-------+---------+-------------+

mysql> SELECT i, n, t, v FROM shn1 AS T1, (SELECT MAX(i) AS ii FROM shn1 GROUP BY n) AS T2 WHERE T1.i = T2.ii;
...
215 rows in set (14.26 sec)

Also not good.

3) join
mysql> EXPLAIN SELECT MAX(T1.i), T2.n, T2.t, T2.v FROM shn1 AS T1 LEFT JOIN shn1 AS T2 ON T2.i = T1.i GROUP BY T1.n;
+----+-------------+-------+-------+---------------+------+---------+---------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+---------+---------+-------------+
| 1 | SIMPLE | T1 | index | NULL | ni | 6 | NULL | 7699429 | Using index |
| 1 | SIMPLE | T2 | ref | PRIMARY,in,it | it | 4 | br.T1.i | 1 | |
+----+-------------+-------+-------+---------------+------+---------+---------+---------+-------------+

mysql> SELECT MAX(T1.i), T2.n, T2.t, T2.v FROM shn1 AS T1 LEFT JOIN shn1 AS T2 ON T2.i = T1.i GROUP BY T1.n;
...
215 rows in set (6 min 21.77 sec)

Still slow, it uses index with 7699429 rows instead of original index for group-by with 216 rows.

4) temporary table
mysql> CREATE TEMPORARY TABLE T SELECT MAX(i) FROM shn1 GROUP BY n;
Query OK, 215 rows affected (14.02 sec)

This is very surprising, I'm almost sure it uses "index with 7699429 rows" again - time is very close to
mysql> EXPLAIN SELECT * FROM (SELECT MAX(i) FROM shn1 GROUP BY n) AS T1;
+----+-------------+------------+-------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+------+---------+------+---------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 215 | |
| 2 | DERIVED | shn1 | index | NULL | ni | 6 | NULL | 7699429 | Using index |
+----+-------------+------------+-------+---------------+------+---------+------+---------+-------------+

mysql> SELECT * FROM (SELECT MAX(i) FROM shn1 GROUP BY n) AS T1;
...
215 rows in set (13.86 sec)

All this is very strange, seems it doesnt want to use "index for group-by" in any combination, including CREATE TABLE, INSERT INTO etc. Please help me understand.

Thank you in advance,

Michael

---

P.S. I'm playing with MySQL 5.0.27-Debian_1-log (same thing with 5.0.22), analyzed, optimized tables.

Options: ReplyQuote


Subject
Views
Written By
Posted
Query with GROUP BY (Using index for group-by -> Using index)
9600
November 30, 2006 05:20AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.