MySQL Forums
Forum List  »  Performance

Query not using index
Posted by: Chris O'Brien
Date: May 27, 2009 10:57AM

I am trying to do statistic analysis on a large number of records over a day to see how many transactions were happening in a given time. The table contains 28,601,712 rows which I understand is part of the problem. Each query I run takes ~30 seconds to return, and no matter what I do I cannot get MySQL to use an INDEX to speed things up.

Here is the table I created:
CREATE TABLE flows 
(s_date DATE, s_time TIME, duration FLOAT(8,3), 
proto VARCHAR(4), src_ip VARCHAR(15), src_port VARCHAR(5), 
dst_ip VARCHAR(15), dst_port VARCHAR(4), bytes INT(10), 
idx INT(8) NOT NULL AUTO_INCREMENT, PRIMARY KEY(idx));
DESCRIBE flows;
+----------+-------------+------+-----+---------+----------------+
| Field    | Type        | Null | Key | Default | Extra          |
+----------+-------------+------+-----+---------+----------------+
| s_time   | datetime    | YES  | MUL | NULL    |                |
| e_time   | datetime    | YES  | MUL | NULL    |                |
| duration | float(8,3)  | YES  |     | NULL    |                |
| proto    | varchar(4)  | YES  |     | NULL    |                |
| src_ip   | varchar(15) | YES  |     | NULL    |                |
| src_port | varchar(5)  | YES  |     | NULL    |                |
| dst_ip   | varchar(15) | YES  |     | NULL    |                |
| dst_port | varchar(5)  | YES  |     | NULL    |                |
| bytes    | int(10)     | YES  |     | NULL    |                |
| idx      | int(8)      | NO   | PRI | NULL    | auto_increment |
+----------+-------------+------+-----+---------+----------------+

Here is a sample row:
SELECT * FROM flows WHERE idx="1";
+---------------------+---------------------+----------+-------+----------------+----------+---------------+----------+-------+-----+
| s_time              | e_time              | duration | proto | src_ip         | src_port | dst_ip        | dst_port | bytes | idx |
+---------------------+---------------------+----------+-------+----------------+----------+---------------+----------+-------+-----+
| 2009-04-27 00:00:00 | 2009-04-27 00:00:01 |    1.856 | TCP   |        1.2.3.4 | 62148    |       5.6.7.8 | 80       |   946 |   1 |
+---------------------+---------------------+----------+-------+----------------+----------+---------------+----------+-------+-----+

I tried creating the following index, but it didn't work:
CREATE INDEX time ON flows (s_time,e_time);

I dropped that one and tried creating two separate INDEXes, one for each DATETIME field:
DROP INDEX time ON flows;
CREATE INDEX stime ON flows (s_time);
CREATE INDEX etime ON flows (e_time);
SHOW INDEX FROM flows;

SHOW INDEX FROM flows;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| flows |          0 | PRIMARY  |            1 | idx         | A         |    28601712 |     NULL | NULL   |      | BTREE      |         |
| flows |          1 | stime    |            1 | s_time      | A         |       86149 |     NULL | NULL   | YES  | BTREE      |         |
| flows |          1 | etime    |            1 | e_time      | A         |       86410 |     NULL | NULL   | YES  | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+


That still doesn't get used though.

Here is a sample query I am running. I am trying to find any records that took place over the 1 minute period:
SELECT COUNT(DISTINCT src_ip) FROM flows 
WHERE (s_time <= "2009-04-27 01:00:00" AND e_time >= "2009-04-27 01:00:59") 
OR (s_time >= "2009-04-27 01:00:00" AND e_time <= "2009-04-27 01:00:59") 
OR (s_time >= "2009-04-27 01:00:00" AND s_time <= "2009-04-27 01:00:59" AND e_time >= "2009-04-27 01:00:59") 
OR (s_time <= "2009-04-27 01:00:00" AND e_time >= "2009-04-27 01:00:00" AND e_time <= "2009-04-27 01:00:59") ;
+------------------------+
| COUNT(DISTINCT src_ip) |
+------------------------+
|                   1035 |
+------------------------+


If I use EXPLAIN, it sees the INDEXes I created in both cases but won't use them:
EXPLAIN SELECT COUNT(DISTINCT src_ip) FROM flows WHERE (s_time <= "2009-04-27 01:00:00" AND e_time >= "2009-04-27 01:00:59") OR (s_time >= "2009-04-27 01:00:00" AND e_time <= "2009-04-27 01:00:59") OR (s_time >= "2009-04-27 01:00:00" AND s_time <= "2009-04-27 01:00:59" AND e_time >= "2009-04-27 01:00:59") OR (s_time <= "2009-04-27 01:00:00" AND e_time >= "2009-04-27 01:00:00" AND e_time <= "2009-04-27 01:00:59") ;
+----+-------------+-------+------+---------------+------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows     | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+----------+-------------+
|  1 | SIMPLE      | flows | ALL  | stime,etime   | NULL | NULL    | NULL | 28601712 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+----------+-------------+

I have also tried CAST() and CONVERT() thinking that MySQL was treating the dates I'm entering as strings instead of DATETIME data, but that hasn't helped either. The only thing I could find that would potentially stop indexes from working on DATETIME data was using a function like DATE() on the data I'm comparing to but I'm not doing that. Am I just using too complicated a query for an index to help?
-Chris



Edited 1 time(s). Last edit at 05/28/2009 07:28AM by Chris O'Brien.

Options: ReplyQuote


Subject
Views
Written By
Posted
Query not using index
4100
May 27, 2009 10:57AM
1942
May 27, 2009 08:28PM
1825
May 28, 2009 07:25AM
1768
May 29, 2009 12:32AM
1299
June 04, 2009 12:37PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.