MySQL Forums
Forum List  »  Optimizer & Parser

Use of DISTINCT in INNER JOIN
Posted by: Steve Croall
Date: March 24, 2009 12:07PM

When I use DISTINCT in an INNER JOIN query that returns more than 1 column in the SELECT a full table scan is performed rather than using an index. When 1 column is returned in the SELECT the index is used.

(1) Create the following tables:

CREATE TABLE tab1
(
col1 int NOT NULL,
col2 varchar(24) NOT NULL,
col3 int NOT NULL,

CONSTRAINT pk_tab1 PRIMARY KEY CLUSTERED (col1)
);

CREATE TABLE tab2
(
col1 int NOT NULL,
col2 varchar(24) NOT NULL,
col3 int NOT NULL,

CONSTRAINT pk_tab2 PRIMARY KEY CLUSTERED (col1, col2)
);

ALTER TABLE tab2 ADD CONSTRAINT fk_tab2 FOREIGN KEY (col1) REFERENCES tab1 (col1) ON DELETE CASCADE;

CREATE INDEX ix_tab2 ON tab2 (col2);

(2) Next run the following SELECTs using EXPLAIN:

EXPLAIN SELECT DISTINCT a.col1
FROM tab1 a INNER JOIN tab2 b
ON a.col1 = b.col1
WHERE b.col2 in ('value1', 'value2')
ORDER BY a.col1
;
+----+-------------+-------+-------+-----------------+---------+---------+----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-----------------+---------+---------+----------------+------+----------------------------------------------+
| 1 | SIMPLE | a | index | PRIMARY | PRIMARY | 4 | NULL | 1 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | b | ref | PRIMARY,ix_tab2 | PRIMARY | 4 | brmuser.a.col1 | 1 | Using where; Using index; Distinct |
+----+-------------+-------+-------+-----------------+---------+---------+----------------+------+----------------------------------------------+

EXPLAIN SELECT DISTINCT a.col1, a.col2
FROM tab1 a INNER JOIN tab2 b
ON a.col1 = b.col1
WHERE b.col2 in ('value1', 'value2')
ORDER BY a.col1
;
+----+-------------+-------+------+-----------------+---------+---------+----------------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------+---------+---------+----------------+------+------------------------------------+
| 1 | SIMPLE | a | ALL | PRIMARY | NULL | NULL | NULL | 1 | Using temporary; Using filesort |
| 1 | SIMPLE | b | ref | PRIMARY,ix_tab2 | PRIMARY | 4 | brmuser.a.col1 | 1 | Using where; Using index; Distinct |
+----+-------------+-------+------+-----------------+---------+---------+----------------+------+------------------------------------+

The addition of the second column, a.col2, in the SELECT causes a full table scan.

Any ideas why this is? Could it be related to an existing bug: http://bugs.mysql.com/bug.php?id=41994

Showing the execution plan of the same queries in MSSQL shows both using the PRIMARY key.

Thanks.

Options: ReplyQuote


Subject
Views
Written By
Posted
Use of DISTINCT in INNER JOIN
66755
March 24, 2009 12:07PM
10896
March 24, 2009 02:00PM
6724
March 25, 2009 07:01PM
6435
March 27, 2009 05:20AM
5636
March 27, 2009 10:01AM
6607
March 28, 2009 02:02AM
4507
March 28, 2009 03:53PM
5700
April 01, 2009 03:36AM
5977
April 02, 2009 12:16AM
5443
April 02, 2009 06:29AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.