We're running Foreman, and we have a particular query being run by a rake job that results in the following SQL:
DELETE FROM `sources` WHERE (id not IN (SELECT source_id FROM `logs` ))
The "logs" table has 8 million records, and the rake job gets stuck on that DELETE statement and never completes (we've let it run for several hours). We've tried to truncate the table, but the problem is that it grows much too fast. Within less than 48 hours it will be back to that number of records.
I'm trying to figure out if a left join might be faster:
DELETE FROM from sources as s left join logs as l on s.id = l.source_id where l.source_id is NULL;
Are there any other ways I could potentially optimize this query? I've included the table create statements and structure below, along with the explain statements for the two queries.
mysql> show create table sources\G
*************************** 1. row ***************************
Table: sources
Create Table: CREATE TABLE `sources` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`value` text,
`digest` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_sources_on_value` (`value`(300)),
KEY `index_sources_on_digest` (`digest`)
) ENGINE=InnoDB AUTO_INCREMENT=121117 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
mysql> show create table logs\G
*************************** 1. row ***************************
Table: logs
Create Table: CREATE TABLE `logs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`source_id` int(11) DEFAULT NULL,
`message_id` int(11) DEFAULT NULL,
`report_id` int(11) DEFAULT NULL,
`level_id` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_logs_on_report_id` (`report_id`),
KEY `index_logs_on_message_id` (`message_id`),
KEY `index_logs_on_level_id` (`level_id`)
) ENGINE=InnoDB AUTO_INCREMENT=8729466 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
mysql> show table status like 'logs'\G
*************************** 1. row ***************************
Name: logs
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 8235949
Avg_row_length: 56
Data_length: 462340096
Max_data_length: 0
Index_length: 485670912
Data_free: 5242880
Auto_increment: 8729466
Create_time: 2015-04-01 20:07:07
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
mysql> show table status like 'sources'\G
*************************** 1. row ***************************
Name: sources
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 122542
Avg_row_length: 175
Data_length: 21544960
Max_data_length: 0
Index_length: 33865728
Data_free: 6291456
Auto_increment: 121117
Create_time: 2015-11-04 12:58:10
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)
mysql> explain select * from sources WHERE (id not IN (SELECT source_id FROM `logs` ));
+----+-------------+---------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+---------------+------+---------+------+---------+-------------+
| 1 | PRIMARY | sources | ALL | NULL | NULL | NULL | NULL | 122542 | Using where |
| 2 | SUBQUERY | logs | ALL | NULL | NULL | NULL | NULL | 8235949 | NULL |
+----+-------------+---------+------+---------------+------+---------+------+---------+-------------+
2 rows in set (0.00 sec)
mysql> explain extended select s.* from sources as s left join logs as l on s.id = l.source_id where l.source_id is NULL;
+----+-------------+-------+------+---------------+------+---------+------+---------+----------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+------+---------+------+---------+----------+----------------------------------------------------+
| 1 | SIMPLE | s | ALL | NULL | NULL | NULL | NULL | 122542 | 100.00 | NULL |
| 1 | SIMPLE | l | ALL | NULL | NULL | NULL | NULL | 8235949 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+---------+----------+----------------------------------------------------+