I've also posted this problem to the Hyperic HQ forums, as the query being used and schema is from their application.
http://forums.hyperic.com/jiveforums/thread.jspa?messageID=20253伝
But since the problem seems to be odd behavior from MySQL, using an InnoDB storage engine, I thought perhaps I could get more specific help from y'all.
On a dual quadcore Xeon (2.7GhZ) with 12GB RAM (16GB with 4GB going to Xen VM), Red Hat 5 Linux, and using MySQL 5.0.22 (and 5.0.46 on the RHEL 5 VM), I have one query that consumes roughly 30 seconds and appears to just be mysqld spinning at 100% (processlist show "Sending data" state). The query is used in the monitoring application to fetch historical performance data, and it's the only one that appears to be slow. It used to be really quite fast, definitely in the 7 sec or less range for the entire page to be displayed.
I thought perhaps the table had just grown too large, and maybe that's true. Anyway, here's the query:
# Time: 080910 16:55:55
# User@Host: hqadmin[hqadmin] @ localhost [127.0.0.1]
# Query_time: 31 Lock_time: 0 Rows_sent: 14 Rows_examined: 60
use hqdb;
SELECT i FROM EAM_NUMBERS WHERE i < 60 AND EXISTS (SELECT id FROM EAM_EVENT_LOG WHERE timestamp BETWEEN (1220993724472 + (1440000 * i))
AND ((1220993724472 + (1440000 * (i + 1))) - 1) AND entity_id = 10229 AND entity_type = 1 LIMIT 1);
The EAM_NUMBERS table contains a single bigint(20) column, with 60 rows (values 1-60). The EAM_EVENT_LOG table is the big one:
mysql> desc EAM_EVENT_LOG;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| ID | int(11) | NO | PRI | | |
| DETAIL | varchar(500) | NO | | | |
| TYPE | varchar(100) | NO | | | |
| TIMESTAMP | bigint(20) | NO | MUL | | |
| ENTITY_ID | int(11) | YES | | NULL | |
| ENTITY_TYPE | int(11) | YES | | NULL | |
| SUBJECT | varchar(100) | YES | | NULL | |
| STATUS | varchar(100) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
8 rows in set (0.00 sec)
mysql> select count(*) from EAM_EVENT_LOG;
+----------+
| count(*) |
+----------+
| 1098105 |
+----------+
1 row in set (7.76 sec)
mysql> EXPLAIN SELECT i FROM EAM_NUMBERS WHERE i < 60 AND EXISTS (SELECT id FROM EAM_EVENT_LOG WHERE timestamp BETWEEN (1220993724472 + (1440000 * i)) AND ((1220993724472 + (1440000 * (i + 1))) - 1) AND entity_id = 10229 AND entity_type = 1 LIMIT 1);
+----+--------------------+---------------+-------+---------------+---------------+---------+------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------------+-------+---------------+---------------+---------+------+---------+--------------------------+
| 1 | PRIMARY | EAM_NUMBERS | range | PRIMARY | PRIMARY | 8 | NULL | 30 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | EAM_EVENT_LOG | index | NULL | EVENT_LOG_IDX | 18 | NULL | 1188021 | Using where; Using index |
+----+--------------------+---------------+-------+---------------+---------------+---------+------+---------+--------------------------+
The configuration is basically stock RHEL5 rpm config (nothing) along with
http://support.hyperic.com/display/DOC/MySQL+DB+Preparation. I've seen CPU for mysql spike as high as perhaps 246%, but it hangs around 99% for the 30 sec run. By doing a mysqldump --opt of the database (804MB uncompressed), and importing into the 5.0.46 instance in the VM, it took 28, 24 and 24 secs, so I don't think the "30 secs" is a magic number, even though it runs that long fairly often :)
The high cpu spinning is what confused me, but I am an InnoDB n00b. I suppose the whole table could fit in buffer cache (swappiness is still 60, I believe on both systems).
Anyway, I would love any tips or tricks to solve this one. I would love anything, really, but mostly making it fast would be best :)
Thanks all.
- edan