Cluster failure
I've been looking at putting a MySQL cluster into production and have been trying to benchmark it using the sql-bench tests in various setups. I have 4 ndb/api nodes and one management node. I was running sql-bench from one of the nodes and can consistently crash the cluster. Each node has a single 2.4GHz CPU and 512MB of RAM. On the mgm node I get the following in the logs:
(Node 1 is management, 2,3,4,5 are ndb, 6,7,8,9 are mysql)
2005-11-06 15:00:12 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2
2005-11-06 15:00:50 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2
2005-11-06 15:01:02 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2
2005-11-06 15:01:02 [MgmSrvr] WARNING -- Node 5: Node 6 missed heartbeat 2
2005-11-06 15:01:04 [MgmSrvr] WARNING -- Node 5: Node 6 missed heartbeat 3
2005-11-06 15:01:05 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-11-06 15:01:10 [MgmSrvr] WARNING -- Node 5: Node 6 missed heartbeat 2
2005-11-06 15:01:11 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2
2005-11-06 15:01:19 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2
2005-11-06 15:01:23 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2
2005-11-06 15:01:24 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-11-06 15:01:28 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2
2005-11-06 15:01:29 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 3
2005-11-06 15:01:32 [MgmSrvr] INFO -- Node 1: Node 5 Connected
2005-11-06 15:01:32 [MgmSrvr] ALERT -- Node 5: Forced node shutdown completed. Initiated by signal 0. Caused by error 2305: 'Arbitrator shutdown, please investigate error(s) on other node(s)(Arbitration error). Temporary error, restart node'.
2005-11-06 15:01:45 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-11-06 15:01:50 [MgmSrvr] INFO -- Mgmt server state: nodeid 6 reserved for ip 192.168.99.101, m_reserved_nodes 0000000000000042.
2005-11-06 15:01:50 [MgmSrvr] ALERT -- Node 2: Forced node shutdown completed. Initiated by signal 0. Caused by error 2305: 'Arbitrator shutdown, please investigate error(s) on other node(s)(Arbitration error). Temporary error, restart node'.
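To see at a glance which nodes are missing heartbeats in a log like the one above, the warnings can be tallied per node. This is only a rough sketch: the regular expression is inferred from the log lines shown here rather than from any documented format, and the function name count_missed_heartbeats is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# 2005-11-06 15:00:12 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2
# Group 1 is the reporting node, group 2 the node that missed the heartbeat.
# The pattern is an assumption based on the log excerpt, not an official format.
MISSED = re.compile(r"Node (\d+): Node (\d+) missed heartbeat (\d+)")


def count_missed_heartbeats(lines):
    """Return a Counter mapping each node id to its number of missed-heartbeat warnings."""
    counts = Counter()
    for line in lines:
        match = MISSED.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


# A few lines copied from the log above.
sample = [
    "2005-11-06 15:00:12 [MgmSrvr] WARNING -- Node 5: Node 2 missed heartbeat 2",
    "2005-11-06 15:01:02 [MgmSrvr] WARNING -- Node 5: Node 6 missed heartbeat 2",
    "2005-11-06 15:01:04 [MgmSrvr] WARNING -- Node 5: Node 6 missed heartbeat 3",
    "2005-11-06 15:01:05 [MgmSrvr] INFO -- Node 1: Node 2 Connected",
]

for node, warnings in sorted(count_missed_heartbeats(sample).items()):
    print(f"node {node}: {warnings} missed-heartbeat warning(s)")
```

Feeding it the whole cluster.log instead of the sample (for example by passing an open file object) should show whether the warnings cluster around one node or are spread across several.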
All nodes are connected with a 100Mbit switch on a separate LAN segment. I tried swapping in a different switch just to make sure my cheap one wasn't being overloaded, with the same results. All of the nodes have mysql 5.0.15 running on them. sql-bench completes ATIS and big-tables but crashes on connect. Insert, select, and wisconsin then fail because the cluster is dead. My config.ini is:
[NDBD DEFAULT]
NoOfReplicas = 2
DataMemory = 400M
IndexMemory = 200M
MaxNoOfAttributes = 40000
MaxNoOfOrderedIndexes = 10000
MaxNoOfConcurrentOperations = 200000
MaxNoOfTables = 9000
[NDB_MGMD]
hostname = 192.168.99.100
datadir = /var/lib/mysql-cluster
LogDestination = FILE:filename=/var/log/mysql/cluster.log,maxsize=1000000,maxfiles=6
[NDBD]
hostname = 192.168.99.101
datadir = /var/lib/mysql-cluster
[NDBD]
hostname = 192.168.99.102
datadir = /var/lib/mysql-cluster
[NDBD]
hostname = 192.168.99.103
datadir = /var/lib/mysql-cluster
[NDBD]
hostname = 192.168.99.104
datadir = /var/lib/mysql-cluster
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
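One thing that may be worth checking against the config above is whether the configured memory fits in each node's 512MB of RAM. The arithmetic below is a sketch under stated assumptions: it assumes DataMemory and IndexMemory are allocated per data node, it ignores every other buffer plus the OS and mysqld, and the helper names parse_size and memory_shortfall are hypothetical.

```python
# Size suffixes as used in config.ini values such as "400M".
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size string such as '400M' to bytes (a plain number means bytes)."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


def memory_shortfall(data_memory, index_memory, physical_ram):
    """Bytes by which DataMemory + IndexMemory exceed RAM (negative means headroom).

    Assumption: both pools are allocated in full on every data node.
    """
    return parse_size(data_memory) + parse_size(index_memory) - parse_size(physical_ram)


# Values from [NDBD DEFAULT] above and the 512MB of RAM per node.
shortfall = memory_shortfall("400M", "200M", "512M")
print(shortfall // 1024 ** 2, "MB over physical RAM")
```

If that assumption holds, the two pools alone come to 600MB, which is more than the machine has. That could push the data nodes into swap and might explain the missed heartbeats, though I have not confirmed it.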
Are there some things in either the tests or the cluster setup I should be modifying? Should the cluster (maybe properly configured) be able to handle the tests in sql-bench?
Posted November 06, 2005 03:36PM