Help optimizing a SELECT WHERE GROUP BY query
Posted by:
Rob Blick
Date: August 25, 2005 01:41PM
Hey all-
I have a query I'm trying to optimize, and I'm not getting anywhere:
mysql> describe gene2word;
+---------------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+----------+------+-----+---------+-------+
| locuslink_id | char(12) | | PRI | | |
| tax_id | char(12) | | PRI | | |
| word_id | char(12) | | PRI | | |
| num_genes | int(11) | | | 0 | |
| num_pmids | int(11) | | | 0 | |
| doc_frequency | int(11) | | | 0 | |
| col_frequency | int(11) | | | 0 | |
+---------------+----------+------+-----+---------+-------+
Query is:
SELECT word_id,tax_id, COUNT(*), num_genes
FROM gene2word
WHERE locuslink_id in ("7428", "7422", "112399", "7157", "317", etc. - up to 2500 of these)
GROUP BY word_id, tax_id;
Explain output (with only the PK index) shows:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: gene2word
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 9505081
Extra: Using where; Using temporary; Using filesort
1 row in set (0.50 sec)
I tried creating an index on (word_id, tax_id, locuslink_id, num_genes) and I still get this ugly explain output:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: gene2word
type: index
possible_keys: PRIMARY
key: idxTmp
key_len: 40
ref: NULL
rows: 9505081
Extra: Using where; Using index
1 row in set (0.52 sec)
Strangely, even though the index is being used, the whole table is still being examined. I'm hoping I'm overlooking something - does anyone have any insight? Thanks!