MySQL Forums

data node fail causes all sql nodes to fail
Posted by: Adam Goss
Date: June 04, 2008 12:11PM

I have been testing MySQL clustering in the following configuration. I have three test systems, all running CentOS 5.1 with MySQL 5.0.51a Community from the Downloads area (RHEL 5 on x86). One is the management server, and all three run both ndbd and mysqld, so effectively the architecture is as follows:

NoOfReplicas=1

System 1 - Management node, Data Node, SQL Node
System 2 - Data Node, SQL Node
System 3 - Data Node, SQL Node

Given this configuration, I would expect the cluster to be fully fault-tolerant. Loss of any one system (or even two) should still allow the cluster to function, since there is always a data node and an SQL node available, and the MySQL 5.0 reference documentation indicates that failure of the management node won't impact the cluster. I have verified that with all systems running the cluster is functional (i.e. all systems show as connected in ndb_mgm), and I have verified that the data in my clustered DB is being replicated amongst all systems.

When I first tested a network/hardware failure (i.e. unplugging a network cable), the whole cluster would fail, but updating the manager's config.ini to include the ndbd option 'StopOnError=false' fixed this. Now when I disconnect a system, the other two data nodes stay active (or, for some reason, one of the remaining two crashes but the third stays alive), but ALL SQL nodes get disconnected. I have been unable to find a configuration option for mysqld equivalent to 'StopOnError' for ndbd.

I don't have enough systems at my disposal to test a cluster of two data nodes and two SQL nodes on four systems, so is this a bug, or an artifact of my cluster's configuration? Any input would be very much appreciated.

For those who have questions: the clustered MySQL database is intended to be used as a High Availability backend to a RADIUS implementation (i.e. three systems each running a radiusd instance, with MySQL clustered so that user credential info is up to date on all three systems). It is not expected to see a massive volume of queries, given a user pool of < 250.
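For readers trying to reproduce the setup, a config.ini matching the description above might look roughly like the sketch below. This is a reconstruction, not the poster's actual file: only NoOfReplicas=1 and StopOnError=false come from the post, the hostnames are placeholders, and every other setting (data directories, memory sizes and so on) is left out.

```ini
# Sketch of a config.ini for the three-system layout described in the post.
# Hostnames are hypothetical placeholders; substitute real addresses.

[ndbd default]
# Stated in the post.
NoOfReplicas=1
# Added by the poster so that ndbd restarts instead of exiting on error.
StopOnError=false

# System 1 hosts the management node.
[ndb_mgmd]
HostName=system1.example.com

# One data node (ndbd) per system.
[ndbd]
HostName=system1.example.com

[ndbd]
HostName=system2.example.com

[ndbd]
HostName=system3.example.com

# One SQL node (mysqld) per system.
[mysqld]
HostName=system1.example.com

[mysqld]
HostName=system2.example.com

[mysqld]
HostName=system3.example.com
```

Writing the file out this way makes it easy to see which values are global ([ndbd default]) and which are per node when comparing against the reference manual.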

