I'll be the first to admit that I don't know a whole lot about databases, but I was recently running into a performance problem with a query that mysql implemented as two nested full table scans. If I were to implement the join outside of a database, I would do it very differently.
Here's a stripped down example: say you have two tables
Create Table: CREATE TABLE `t1` (
`prim` int(11) NOT NULL auto_increment,
`f` varchar(20) default NULL,
PRIMARY KEY (`prim`),
KEY `f` (`f`)
) ENGINE=InnoDB AUTO_INCREMENT=2913 DEFAULT CHARSET=latin1
CREATE TABLE `t2` (
`prim` int(11) NOT NULL auto_increment,
`f` varchar(20) default NULL,
PRIMARY KEY (`prim`)
) ENGINE=InnoDB AUTO_INCREMENT=93185 DEFAULT CHARSET=latin1
Now say I wanted to do this query:
mysql> explain select t1.prim,t2.f from t1 left outer join t2 on t1.f=t2.f;
+----+-------------+-------+-------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | t1 | index | NULL | f | 23 | NULL | 3075 | Using index |
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 88416 | |
+----+-------------+-------+-------+---------------+------+---------+------+-------+-------------+
I printed out the table statuses at the end of this post.
So mysql is going to scan through the t1's index on f, and for every row, it'll do a full table scan of t2. But that doesn't make a whole lot of sense -- especially since I happened to have ordered t2 on the f column. What I would want to happen is for t2.f to be filesorted (if it wasn't already ordered that way) and then t1's f index and the sorted list to be scanned in parallel: for every row of t1.f, check whether there is a matching t2.f. If not, output a NULL and advance t1.f. If so, output a row for every matching t2.f, then advance t1.f. If it changes, advance t2.f. Or something like that.
As I understand mysql's optimizer, it never considers this option. It always arranges the tables in some order, then grabs one row for the first table (or index on the table or whatever), then does the next table's lookup operation, then the next, etc. until the full join is processed. So there's no way of describing a parallel scan.
But it doesn't feel like it would be that much of a stretch to add one in, which makes me wonder if I'm just missing something and it's already there. Can anyone confirm or deny it?
The most straightforward way that I can think of wedging in this operation, assuming it doesn't already exist, would be to either add a 'Extra' to type 'ALL' which says to use a cursor (sorry; that's an overloaded term), or to make a new type. Either way, you execute the query as if it were nested full table scans or whatever, but you know that the inner table scan can always start from (almost) where it left off last time. The cursor would always scan until just before the next one greater than the search key, and wouldn't actually advance for the purposes of the next iteration if the search key was found. But the point is that this could preserve the general structure of mysql's query plans while allowing an optimization that it seems to me would be applicable in quite a few situations.
(There are additional wrinkles that I can think of of the top of my head. For triply-nested loops, or worse, you'd need to reset the cursor to the beginning when the outer outer loop advanced. And you could simplify the cursor advancement logic if either the outer or the inner scan was known to have unique values. And I'm sure many other things.)
Am I crazy or just naive? In the case of my query, this would bring the number of rows examined down from the product of the two tables sizes to their sum, which seems worthwhile.
----
mysql> show table status like 't1' \G
*************************** 1. row ***************************
Name: t1
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 2528
Avg_row_length: 64
Data_length: 163840
Max_data_length: 0
Index_length: 196608
Data_free: 0
Auto_increment: 2913
Create_time: 2009-05-18 12:03:08
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment: InnoDB free: 1379328 kB
1 row in set (0.00 sec)
mysql> show table status like 't2' \G
*************************** 1. row ***************************
Name: t2
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 90870
Avg_row_length: 63
Data_length: 5783552
Max_data_length: 0
Index_length: 0
Data_free: 0
Auto_increment: 93185
Create_time: 2009-05-19 14:35:17
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment: InnoDB free: 4096 kB