Urgent help needed - new slave is causing trouble, blocking our whole cluster
Hello everyone,
We have an acute problem with our replication setup at the moment: a new slave we just set up is showing very odd behavior.
The setup: We have one DB master and, since today, three DB slaves (we added one today, and it is the one causing trouble). The website is a large, high-traffic web community with some MyISAM and some InnoDB tables.
When we put the new slave (dual Xeon, 4 GB RAM, Fedora Core 2, just like all the other slaves, which are running perfectly) into our rotation, it locks up: simple SELECT queries sit in the process list for minutes, queue up, and exhaust the connection limit. Oddly enough, this also causes the MASTER to block, with a rising number of simultaneous connections.
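To make the symptom easier to compare between the new slave and a healthy one, here is a minimal sketch of how the stuck statements could be grouped by their State column. It assumes the rows of SHOW PROCESSLIST have already been fetched as tuples in the usual column order (Id, User, Host, db, Command, Time, State, Info). The function name, the 60-second threshold, and the sample rows are made up for illustration and are not real output from our servers.

```python
from collections import Counter

# Column positions in a SHOW PROCESSLIST row:
# (Id, User, Host, db, Command, Time, State, Info)
COMMAND, TIME, STATE = 4, 5, 6


def summarize_processlist(rows, min_seconds=60):
    """Return (counts_by_state, stuck_rows) for statements running
    at least min_seconds. The threshold is an arbitrary assumption."""
    stuck = [
        r for r in rows
        if r[COMMAND] == "Query" and r[TIME] >= min_seconds
    ]
    counts = Counter(r[STATE] or "(none)" for r in stuck)
    return counts, stuck


# Hypothetical sample rows, NOT taken from the affected machines.
SAMPLE_ROWS = [
    (101, "web", "app1:40001", "community", "Query", 184, "Locked",
     "SELECT * FROM posts WHERE id = 1"),
    (102, "web", "app1:40002", "community", "Query", 170, "Locked",
     "SELECT * FROM posts WHERE id = 2"),
    (103, "web", "app2:40003", "community", "Query", 95, "Sending data",
     "SELECT name FROM users WHERE id = 7"),
    (104, "web", "app2:40004", "community", "Sleep", 300, "", None),
    (105, "web", "app2:40005", "community", "Query", 2, "Sending data",
     "SELECT 1"),
]

if __name__ == "__main__":
    counts, stuck = summarize_processlist(SAMPLE_ROWS)
    for state, n in counts.most_common():
        print(f"{state}: {n}")
```

Running the same summary against a healthy slave and the new one would show whether the waiting SELECTs pile up in one particular state or are spread across several.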
This is how we set up the new slave: We did not follow our usual procedure of stopping the whole cluster and copying the master's database onto the slave. This time we just took one of the existing slaves out of the cluster and copied its data onto the new machine, then started replication. The synchronization is running completely normally, with no lag. But as soon as you put even a little read load on the new slave, it goes crazy.
We once had a similar problem where a high number of unauthenticated connections blocked the cluster; that was caused by DNS problems. But this time it must be a completely different problem. All machines run the same MySQL version (4.1.11). (We also tried MySQL 4.1.15/glibc2.3 on that slave; it made no difference.)
Any help is highly appreciated.
Regards,
Julien
Hamburg