Re: Which server hardware would you pick?
Wow, some great leads Rick. I had previously done research into locking, slow queries, CPU utilization and disk I/O. No problems anywhere, it looks like both servers are not being pushed at all - idle really.
Here are some results to your suggestions.
> Do SHOW VARIABLES on each server, then diff the outputs. Report the diffs; we can discuss them. innodb_flush_log_at_trx_commit, if different, may be the cause.
There are a surprising number of differences (330 lines worth) because of different OS, different paths etc.
The innodb_flush_log_at_trx does not show in the diff, it is 2 on both servers.
> Do you delete all 2M rows?
All the rows would be deleted over a 7 day period. I have a script that runs every minute and will delete a max. of 500 rows per minute.
> Please show us the DELETE statement and the SHOW CREATE TABLE for the two tables.
I delete like this:
delete from tblquarantine
where msgdate <= date_sub(now(), interval 7 day)
order by quarid asc
limit 500;
and then a two part thing:
select min(msgid) msgid from tblquarantine;
delete from tblmsgs
where msgid < {msgid}
limit 500
Here are the creates:
CREATE TABLE `tblquarantine` (
`quarid` int(11) NOT NULL AUTO_INCREMENT,
`domainid` decimal(11,0) NOT NULL DEFAULT '0',
`emailid` decimal(10,0) NOT NULL DEFAULT '0',
`emailfrom` varchar(150) DEFAULT NULL,
`emailto` varchar(150) DEFAULT NULL,
`subject` varchar(250) DEFAULT NULL,
`msgdate` datetime DEFAULT NULL,
`msgid` int(11) DEFAULT NULL,
`rejectid` int(11) DEFAULT NULL,
`rejectdetails` varchar(200) DEFAULT '',
`deliver` tinyint(1) DEFAULT NULL,
`expire` tinyint(1) DEFAULT NULL,
`serverid` int(11) DEFAULT NULL,
`mx_senderip` char(15) DEFAULT NULL,
PRIMARY KEY (`quarid`),
KEY `msgid` (`msgid`),
KEY `domainid-msgdate-emailid` (`domainid`,`msgdate`,`emailid`)
) ENGINE=InnoDB AUTO_INCREMENT=346952671 DEFAULT CHARSET=utf8;
CREATE TABLE `tblmsgs` (
`msgid` int(11) NOT NULL AUTO_INCREMENT,
`msg` longtext,
PRIMARY KEY (`msgid`)
) ENGINE=InnoDB AUTO_INCREMENT=338021167 DEFAULT CHARSET=utf8;
Name: tblquarantine
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 1190770
Avg_row_length: 309
Data_length: 368082944
Max_data_length: 0
Index_length: 123961344
Data_free: 389021696
Auto_increment: 346975366
Collation: utf8_general_ci
Name: tblmsgs
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 1275897
Avg_row_length: 44798
Data_length: 57157894144
Max_data_length: 0
Index_length: 0
Data_free: 54043607040
Auto_increment: 338038873
Collation: utf8_general_ci
> Disk 1: Intel SSD 330 Series 120GB (for OS)
BINGO! This certainly may play a role. The temp drive on server #1 is on the SSD!
datadir: D:\MySQL Data\
innodb_data_home_dir: D:\MySQL Datafiles\
slave_load_tmpdir: C:\Windows\TEMP
tmpdir: C:\Windows\TEMP
> SHOW VARIABLES LIKE 'binlog_format';
The binlog_format is MIXED, I rely on replication to do the delete on the other master. Normally I do everything on one master, all write, all read, all delete - everything. The other server is just like a hot standby, the clients will actually switch automatically if they can't reach the primary for any reason.
When I have been having the problems with server #2, watching the activity it is always these two tables that seem to have the hangup. The odd thing is that it impacts other queries as well. If it was a locking issue you would think a query on a different table (or even database) would not be impacted.
I will move the temp folder to the spinning disk and see how much difference it makes to that server. It will not fix the issue on server #2 but may be a clue.