Derived tables optimized differently than when directly queried?
This is the show create for the table in question
CREATE TABLE `logevents` (
`id` int(11) NOT NULL auto_increment,
`eventtime` datetime default NULL,
`remotehost` varchar(30) default NULL,
`computername` varchar(30) default NULL,
`eventtype` varchar(10) default NULL,
`username` varchar(10) default NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `eventtime` (`eventtime`,`computername`,`eventtype`),
KEY `idxeventtime` (`eventtime`),
KEY `idxcomputername` (`computername`),
KEY `idxusername` (`username`),
KEY `idxcnametime` (`computername`,`eventtime`)
) ENGINE=MyISAM AUTO_INCREMENT=95283 DEFAULT CHARSET=utf8 |
When i run the following query:
mysql> explain select computername,max(eventtime) as "maxtime" from logevents group by computername;
+----+-------------+-----------+-------+---------------+--------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+---------------+--------------+---------+------+------+--------------------------+
| 1 | SIMPLE | logevents | range | NULL | idxcnametime | 93 | NULL | 644 | Using index for group-by |
+----+-------------+-----------+-------+---------------+--------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
as you can see, it uses the index for group by and searches with a range, resulting in a relatively low amount of rows searched, as opposed to when this exact query is used to create a derived table in the query
mysql> explain select a.computername,a.username,a.eventtype,a.eventtime from logevents a, (select computername,max(eventtime) as "maxtime" from logevents group by computername) b where a.eventtime = b.maxtime and a.computername=b.computername;
+----+-------------+------------+-------+-----------------------------------------------------+--------------+---------+--------------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+-----------------------------------------------------+--------------+---------+--------------------------+-------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 643 | |
| 1 | PRIMARY | a | ref | eventtime,idxeventtime,idxcomputername,idxcnametime | idxcnametime | 102 | b.computername,b.maxtime | 1 | Using where |
| 2 | DERIVED | logevents | index | NULL | idxcnametime | 102 | NULL | 95282 | Using index |
+----+-------------+------------+-------+-----------------------------------------------------+--------------+---------+--------------------------+-------+-------------+
3 rows in set (0.43 sec)
Where the derived table query doesnt use the range type of query and has to visit (possibly?) all of the rows. I've heard that the explain can be rather murky and not correctly report the amount of rows that would actually be searched.
Aside: is there any reason why the derived query uses a key when there are no possible keys?
Any suggestions/information would be greatly appreciated.
Thanks,
Tom