Replication stops when changing a merge table
I have a master and slave configured in my production environment. We are seeing about 3000 queries per second on the master and 1000 queries per second on the slave. But I have a problem with the merge tables I'm using for some of my stats information.
Every night we run a roll-up process that summarizes data in the merge tables. At the end of the processing, we do a "FLUSH TABLES; TRUNCATE TABLE stats0" where the table "stats" is a merge table that contains stats0, stats1, and stats2 as its member tables.
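For concreteness, here's a minimal sketch of the layout (the id/hits columns are placeholders, not our actual schema):

    -- Member tables (columns simplified to placeholders):
    CREATE TABLE stats0 (id INT NOT NULL PRIMARY KEY, hits INT) ENGINE=MyISAM;
    CREATE TABLE stats1 (id INT NOT NULL PRIMARY KEY, hits INT) ENGINE=MyISAM;
    CREATE TABLE stats2 (id INT NOT NULL PRIMARY KEY, hits INT) ENGINE=MyISAM;

    -- The merge table over them:
    CREATE TABLE stats (id INT NOT NULL, hits INT)
        ENGINE=MERGE UNION=(stats0, stats1, stats2);

    -- End of the nightly roll-up:
    FLUSH TABLES;
    TRUNCATE TABLE stats0;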
This all works fine on the master, but it doesn't work reliably on the slave. We see this error whenever replication stops:
Slave: Error 'MyISAM table 'stats0' is in use (most likely by a MERGE table). Try FLUSH TABLES.' on query. Default database: 'prod_db'. Query: 'TRUNCATE TABLE stats0', Error_code: 1105
But we don't see this every night. Sometimes it happens, sometimes it works. And sometimes the slave SKIPS the "TRUNCATE..." altogether, and we only find out about it when the application wants to start writing to stats0 again and our primary keys start to clobber each other.
So far, our best theory is that after the slave has processed the "FLUSH TABLES", but before it has processed the "TRUNCATE..." statement, a read request of some sort comes in and re-opens the tables. But we're not sure of this.
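In other words, the interleaving we suspect on the slave is something like this (the SELECT is illustrative; we don't know the actual query involved):

    FLUSH TABLES;                -- replicated from the master; closes stats0

    SELECT COUNT(*) FROM stats;  -- local read slips in, re-opens the merge
                                 -- table, which re-opens stats0 underneath it

    TRUNCATE TABLE stats0;       -- replicated from the master; fails with
                                 -- error 1105 because stats0 is now in use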
I think that if I could get those two statements into the binlog as a single unit to be executed atomically, we would probably be OK.
We've considered doing an "ALTER TABLE..." on the merge table to remove the member table we're working on, doing the "TRUNCATE...", and then doing another "ALTER TABLE..." to bring the member table back into the merge table. But that seems like a lot of work for something that shouldn't cause this problem in the first place.
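Roughly, that workaround would look like this (untested sketch):

    ALTER TABLE stats UNION=(stats1, stats2);          -- detach stats0
    TRUNCATE TABLE stats0;                             -- nothing holds it open now
    ALTER TABLE stats UNION=(stats0, stats1, stats2);  -- reattach it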
FWIW, I am running a very well-tuned system: we had MySQL come out for a performance and tuning visit (I highly recommend the service if you can fit it in your budget), we implemented the merge tables on their suggestion along with their other recommendations, and we have no other problems otherwise. We ARE running the slave with "low-priority-updates" as part of our performance regimen, which may have a bearing on this problem, I don't know.
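For reference, that's just this in the slave's my.cnf (paraphrased, not our full config):

    [mysqld]
    # writes yield to reads, per the tuning recommendations
    low-priority-updates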
I would REALLY like to hear from anyone who has ideas about this issue. I haven't bought my MySQL Network support yet, but I probably will, just so I can resolve this.