MySQL Forums

MySQL Arbitration question
Posted by: Sergey F
Date: November 10, 2015 10:38AM

Hi all.
Our company is very interested in MySQL Cluster, and we have set up a test environment to check how well it works as an HA solution.

We built the following system (all hosts are Xen-based HVM hosts running CentOS 7 and MySQL Cluster 7.4.7 built from source).

Node1 (datanode with nodeid 10) IP 192.168.4.61
Node2 (datanode with nodeid 11) IP 192.168.4.62
Node3 (management with nodeid 1) IP 192.168.4.63

All nodes started fine, and we began testing failures (killing ndbmtd on the data nodes, killing ndb_mgmd on the management node, bringing interfaces down, etc.).

We ran into a strange situation.

Initial state of the MySQL cluster:

[ndbd(NDB)] 2 node(s)
id=10 @192.168.4.61 (mysql-5.6.25 ndb-7.4.7, Nodegroup: 0, *)
id=11 @192.168.4.62 (mysql-5.6.25 ndb-7.4.7, Nodegroup: 0)
[ndb_mgmd(MGM)] 1 node(s)
id=1 @192.168.4.63 (mysql-5.6.25 ndb-7.4.7)

After that, we used the firewall on node 11 to block all incoming traffic from the cluster subnet:
iptables -A INPUT -s 192.168.4.0/24 -j DROP
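
To make the effect of this rule explicit, here is a minimal sketch of which directions of traffic still work between the three hosts. It assumes the DROP rule is the only rule in the INPUT chain on node 11 (no earlier ACCEPT rules) and that outgoing traffic is untouched. The names HOSTS, can_deliver and one_way_links are made up for illustration only and are not part of any MySQL Cluster tool.

```python
import ipaddress

# Hosts from the post: node id -> IP address.
HOSTS = {
    1: "192.168.4.63",   # management node (ndb_mgmd)
    10: "192.168.4.61",  # data node
    11: "192.168.4.62",  # data node
}

# Source range matched by the DROP rule, and the node it was added on.
DROPPED_SOURCES = ipaddress.ip_network("192.168.4.0/24")
FIREWALLED_NODE = 11


def can_deliver(src, dst):
    """True if a packet sent by src is accepted by dst under the assumed rule."""
    if dst != FIREWALLED_NODE:
        return True  # assumption: no filtering on the other hosts
    return ipaddress.ip_address(HOSTS[src]) not in DROPPED_SOURCES


def one_way_links():
    """Pairs (a, b) where a -> b is delivered but b -> a is dropped."""
    links = []
    for a in HOSTS:
        for b in HOSTS:
            if a != b and can_deliver(a, b) and not can_deliver(b, a):
                links.append((a, b))
    return sorted(links)


if __name__ == "__main__":
    for a, b in one_way_links():
        print(f"node {a} can still send to node {b}, but receives nothing back")
```

Under these assumptions node 11 can still send to node 10 and to the management node but hears nothing from either of them, so this test creates a one-way network failure rather than a clean isolation of node 11.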

Right after that, the whole cluster shut down.

Error log from node11:

Time: Tuesday 10 November 2015 - 18:30:50
Status: Temporary error, restart node
Message: Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s) (Arbitration error)
Error: 2305
Error data: Arbitrator decided to shutdown this node
Error object: QMGR (Line: 6235) 0x00000002
Program: ndbmtd
Pid: 3196 thr: 0
Version: mysql-5.6.25 ndb-7.4.7
Trace: /var/lib/mysql-cluster/ndb_11_trace.log.5 [t1..t4]



Error log from node10:

Time: Tuesday 10 November 2015 - 18:30:50
Status: Temporary error, restart node
Message: Node declared dead. See error log for details (Arbitration error)
Error: 2315
Error data: We(10) have been declared dead by 11 (via 11) reason: Heartbeat failure(4)
Error object: QMGR (Line: 4210) 0x00000002
Program: ndbmtd
Pid: 3479 thr: 0
Version: mysql-5.6.25 ndb-7.4.7
Trace: /var/lib/mysql-cluster/ndb_10_trace.log.8 [t1..t4]

We suspect this may be a bug in the cluster logic, since the result was a complete cluster failure.
Can anybody comment on this situation and perhaps suggest how to improve high availability?
Best regards
