MySQL Forums

Impossible to recover from datanode down
Posted by: Jan-willem van Eys
Date: August 16, 2006 04:10AM

I am about to give up on MySQL cluster as a HA solution.

The situation:

First we run into the per-process memory limit of 32-bit systems. A workaround for this is running more ndbd processes per machine, but when I place a comment on the 5.0 Cluster FAQ documentation page, it gets revoked, because 'Currently we do not recommend running multiple data node processes per machine.'
Hmm... So maybe the fact that the cluster can't recover from killing the ndbd processes on one of our 2 data node machines is due to the fact that it's not recommended to run this way.

So we go for the 64-bit version of RHEL4AS on two machines. We run one ndbd process per machine, as recommended by the docs@mysql.com team.

We run a stress test, about 2500 random SELECT, UPDATE, INSERT and DELETE queries per second.
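For readers who want to try something similar, here is a rough sketch of a generator for that kind of mixed workload. It is not the test used in this post: the table `stress_test(id, val)`, the key range and the even four-way split between statement types are all assumptions made up for illustration.

```shell
# Sketch only: emit N random statements against a HYPOTHETICAL table
#   stress_test (id INT PRIMARY KEY, val INT) ENGINE=NDBCLUSTER
# The table name, columns and key range are assumptions, not from the post.
generate_queries() {
  count="$1"
  seed="${2:-1}"   # fixed default seed so runs are repeatable
  awk -v n="$count" -v seed="$seed" 'BEGIN {
    srand(seed)
    for (i = 0; i < n; i++) {
      id  = int(rand() * 10000)     # assumed key range 0..9999
      val = int(rand() * 1000000)
      op  = int(rand() * 4)         # pick one of four statement types
      if (op == 0)      printf "SELECT val FROM stress_test WHERE id = %d;\n", id
      else if (op == 1) printf "UPDATE stress_test SET val = %d WHERE id = %d;\n", val, id
      else if (op == 2) printf "INSERT IGNORE INTO stress_test (id, val) VALUES (%d, %d);\n", id, val
      else              printf "DELETE FROM stress_test WHERE id = %d;\n", id
    }
  }'
}
```

Something like `generate_queries 2500 | mysql test_db` (with `test_db` again a made-up name) would push one batch through a SQL node. The sketch only produces the statements; it does not pace them to a fixed rate per second.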

Process to wreck a cluster:

* run this test
* killall -9 ndbd on one of the data nodes

The database keeps running without a hitch.

* start the killed node

The node is stuck in startup phase 5 forever (according to the ndb_mgm node).
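The steps above can be written down as a small script, which makes the report easier to reproduce. This is a sketch under assumptions: it assumes `ndbd` finds its management server through my.cnf or the default connect string, that the node is restarted without `--initial`, and that the status check is run on a host where `ndb_mgm` can reach the management node. It runs in dry-run mode by default, so it only prints what it would do.

```shell
# Sketch only. DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"      # note: the echo does not preserve argument quoting
  else
    "$@"
  fi
}

# On ONE of the two data node machines, while the stress test is running:
kill_data_node()  { run killall -9 ndbd; }

# On the same machine: restart the killed node (assumed: plain restart, no --initial).
start_data_node() { run ndbd; }

# On a host with the management client: report the state of every node.
show_status()     { run ndb_mgm -e "all status"; }
```

Calling `kill_data_node`, then `start_data_node`, then `show_status` repeatedly with `DRY_RUN=0` walks through the same sequence; the restarted node is the one to watch in the status output.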

I can't imagine that nobody has tried something like this before, so why are there no bug reports, workarounds, solutions or caveats to be found?

All in all this cost us a lot, not only in additional hardware (memory, disks) but also in time. The conclusion is that MySQL clustering is a nice idea, but for now we just have to wait until it works and look somewhere else for stable HA solutions in the meantime.
