MySQL Forums
Forum List  »  Replication

Replication deadlock
Posted by: Tony Nguyen
Date: February 17, 2005 03:17PM

Hello,

I have one master server and several slave servers, each running MySQL 4.1.9 RPMs that I've downloaded from mysql.com. The master's OS is Fedora Core 3 and the slaves are running RH 7.3. I have checked that all slaves have different server-ids associated with them. Finally, all tables are of type MyISAM.

Every once in a while, the read requests being sent to a slave will block on an update statement sent from the master server. The update is very simple: it decrements an account balance's value by a specified amount of money. When this occurs, the system is deadlocked. All new read queries get blocked, and the update statement read by the slave's SQL thread seems to run forever. I have to do a hard reset of the mysql server processes (using kill -9) in order to get things going again. Running "/sbin/service mysql stop" will eventually time out and the admin script give up. I've checked the error log, and nothing seems out of the ordinary.

Here's what a 'mysqladmin processlist' shows on a working slave server:

| 1 | system user | | | Connect | 2593 | Waiting for master to send event
| 2 | system user | | | Connect | -297 | Has read all relay log; waiting for the slave I/O thread

First of all, is that negative value in the 'Time' column suppose to be negative? The value always seems to alternate between '-297' and '-296' on all of my slave servers when they are working properly.

Here's what a processlist shows on a deadlocked slave server:

| 1 | system user | | advertisers | Connect | 1665 | Locked | UPDATE advertisers SET account = account - 0.01 WHERE advertiser_id = 676 |
| 2 | system user | | | Connect | 3917 | Waiting for master to send event |

Notice how ID 2 in this example is now waiting for the master to send and event, indicating that it (ID 2) is now the I/O thread, whereas it was the SQL thread in the first processlist. This seems odd to me of course...

Here is a slave status on a deadlocked instance:

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: findology18
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: findology18-bin.000002
Read_Master_Log_Pos: 229247481
Relay_Log_File: meta12-e0-relay-bin.000008
Relay_Log_Pos: 4307657
Relay_Master_Log_File: findology18-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table: advertisers.clicks
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: advertisers.clicks_by_%
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 223856891
Relay_Log_Space: 9698247
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 2341
1 row in set (0.00 sec)

-------------------

and here is one on a working slave:

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: findology18
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: findology18-bin.000002
Read_Master_Log_Pos: 229569024
Relay_Log_File: meta11-e0-relay-bin.000008
Relay_Log_Pos: 60483200
Relay_Master_Log_File: findology18-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table: advertisers.clicks
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: advertisers.clicks_by_%
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 229569024
Relay_Log_Space: 60483200
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
1 row in set (0.00 sec)

Any help is greatly appreciated!

Thanks,
Tony

Options: ReplyQuote


Subject
Views
Written By
Posted
Replication deadlock
3442
February 17, 2005 03:17PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.