Slow update with two like 'string%' conditions
The following update statement times out:
UPDATE final f, crsp_stocknames q
SET f.permno=q.permno, f.ncusip=q.ncusip, f.namedt=q.namedt, f.nameenddt=q.nameenddt, f.ticker=q.ticker
WHERE f.ncusip is null AND f.symbol LIKE concat(q.ticker,'%') AND f.name LIKE concat(q.comnam,'%') AND (f.datef BETWEEN q.namedt and q.nameenddt);
The objective is to find at the same time partial matches of the symbol and name in the crsp table, (and it has to happen for a given time interval).
The tables are:
CREATE TABLE `final` (
`PK` int(11) NOT NULL AUTO_INCREMENT,
`permno` int(11) DEFAULT NULL,
`cusip` char(8) DEFAULT NULL,
`ncusip` char(8) DEFAULT NULL,
`namedt` int(11) DEFAULT NULL,
`datef` int(11) DEFAULT NULL,
`nameenddt` int(11) DEFAULT NULL,
`symbol` varchar(10) DEFAULT NULL,
`ticker` varchar(10) DEFAULT NULL,
`name` varchar(30) DEFAULT NULL,
PRIMARY KEY (`PK`) KEY_BLOCK_SIZE=8,
KEY `final_cusip` (`cusip`),
KEY `final_symbol` (`symbol`),
KEY `final_datef` (`datef`),
KEY `final_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=65536 DEFAULT CHARSET=utf8
CREATE TABLE `crsp_stocknames` (
`PK` int(10) unsigned NOT NULL AUTO_INCREMENT,
`permno` int(10) DEFAULT NULL,
`permco` int(11) DEFAULT NULL,
`namedt` int(11) DEFAULT NULL,
`nameenddt` int(11) DEFAULT NULL,
`cusip` char(8) DEFAULT NULL,
`ncusip` char(8) DEFAULT NULL,
`ticker` varchar(10) DEFAULT NULL,
`comnam` varchar(40) DEFAULT NULL,
`hexcd` tinyint(4) DEFAULT NULL,
`exchcd` tinyint(4) DEFAULT NULL,
`siccd` smallint(5) unsigned DEFAULT NULL,
`shrcd` tinyint(4) DEFAULT NULL,
`shrcls` char(1) DEFAULT NULL,
`st_date` int(10) unsigned DEFAULT NULL,
`end_date` int(10) unsigned DEFAULT NULL,
`namedum` tinyint(4) DEFAULT NULL,
PRIMARY KEY (`PK`),
UNIQUE KEY `PK_UNIQUE` (`PK`),
KEY `stocknames_ncusip` (`ncusip`),
KEY `stocknames_namedt` (`namedt`),
KEY `stocknames_nameenddt` (`nameenddt`),
FULLTEXT KEY `stocknames_name` (`comnam`),
FULLTEXT KEY `stocknames_ticker` (`ticker`)
) ENGINE=InnoDB AUTO_INCREMENT=55973 DEFAULT CHARSET=utf8
The status:
# Name Engine Version Row_format Rows Avg_row_length Data_length Max_data_length Index_length Data_free Auto_increment Create_time Update_time Check_time Collation Checksum Create_options Comment
crsp_stocknames InnoDB 10 Compact 55711 122 6832128 0 9535488 2097152 55973 2014-03-25 21:35:01 utf8_general_ci
final InnoDB 10 Compact 57402 82 4734976 0 10551296 4194304 65536 2014-03-25 18:29:07 2014-03-25 21:38:39 utf8_general_ci
And the explain of the update:
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, q, , ALL, stocknames_namedt,stocknames_nameenddt, , , , 55711, 100.00,
1, UPDATE, f, , ALL, final_datef, , , , 57402, 100.00, Range checked for each record (index map: 0x8)
Is there something fundamental I am missing? Amd could you explain if the explain is legit, because it says potential keys, but the keys field is empty.
Note that I have not tweaked the .conf
[UPDATE]
The following select runs in 13 seconds:
select f.*
from final f
inner join crsp_stocknames q on f.symbol LIKE concat(q.ticker,'%') AND f.name LIKE concat(q.comnam,'%')
AND (f.datef BETWEEN q.namedt and q.nameenddt)
WHERE f.ncusip is null and q.ticker is not null and q.comnam is not null
Edited 2 time(s). Last edit at 03/26/2014 03:37AM by Oleg Komarov.