MySQL Forums

Re: Network Separation with 3 Replicas?
Posted by: Mikael Ronström
Date: August 28, 2020 05:22AM

We added full support for 3-4 replicas in 8.0. This included some
changes to handling of errors.

You are correct in that 3 machines is still sufficient with
3 replicas.

The algorithm now works the following way:
First some preliminaries. Nodes are organised in node groups. A node
group is similar to a shard: data is fully replicated within a node
group, and each other node group holds different data.

Node failures are transactional, so we ensure that all nodes see node
failures in the same order.

When a set of nodes fails we look at the surviving nodes and compare them
with the set of nodes that were alive before the node failure.

Then we apply the following logic in order.
1) If any node group has no live nodes then the cluster fails
2) If any node group has no node failures then the cluster survives
3) If a majority of the nodes are still alive after the node failure the
cluster will survive
4) If exactly half of the nodes have failed we will ask the arbitrator if
our half is allowed to continue
5) Otherwise the cluster fails
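
The five rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this post, not the actual NDB code. The names (decide, Outcome, node_groups, alive_before, alive_after) are made up for the example, and it assumes that "no node failures" in Rule 2 and "majority" in Rule 3 are both measured against the nodes that were alive just before the failure.

```python
from enum import Enum


class Outcome(Enum):
    SURVIVE = "survive"
    FAIL = "fail"
    ASK_ARBITRATOR = "ask arbitrator"


def decide(node_groups, alive_before, alive_after):
    """Apply the five rules in order (hypothetical helper, not NDB code).

    node_groups:  dict mapping node group id -> set of configured node ids
    alive_before: set of node ids alive before this failure
    alive_after:  set of node ids still alive after it
    """
    failed = alive_before - alive_after

    # Rule 1: a node group with no live nodes means data is missing.
    for members in node_groups.values():
        if not (members & alive_after):
            return Outcome.FAIL

    # Rule 2: a node group that lost no nodes in this failure.
    for members in node_groups.values():
        if not (members & failed):
            return Outcome.SURVIVE

    # Rule 3: a strict majority of the previously alive nodes remains.
    if 2 * len(alive_after) > len(alive_before):
        return Outcome.SURVIVE

    # Rule 4: exactly half remain, so the arbitrator decides.
    if 2 * len(alive_after) == len(alive_before):
        return Outcome.ASK_ARBITRATOR

    # Rule 5: anything else fails.
    return Outcome.FAIL


# One node group with 3 replicas (node ids 1, 2, 3 are arbitrary).
groups = {0: {1, 2, 3}}
first = decide(groups, {1, 2, 3}, {2, 3})   # Outcome.SURVIVE (Rule 3)
second = decide(groups, {2, 3}, {3})        # Outcome.ASK_ARBITRATOR (Rule 4)
both = decide(groups, {1, 2, 3}, {3})       # Outcome.FAIL (Rule 5)
```

The last call shows what a single node sees when it loses both of its peers at the same time: 2 of 3 nodes are gone, which is neither a majority alive nor exactly half, so under these rules it stops without asking the arbitrator.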

So in your case with 3 replicas this means that if 1 node fails the
cluster will always survive, since Rule 1) and Rule 2) are false, but
Rule 3) is true. Thus the arbitrator is not even consulted in this case.

If we have 3 replicas and 1 node is already down and we have another
node failure then Rule 1), 2) and 3) are all false and Rule 4) applies,
since 1 of the 2 nodes that were alive before the failure is now gone.
Thus in this case the arbitrator will decide whether the cluster survives
or not.

Thus with 3 replicas we can survive with even 1 replica remaining, as long
as the nodes fail 1 at a time and the arbitrator allows the last step.

The 2 failed nodes cannot start a new cluster since a cluster start requires
all nodes to be present. A cluster can only be started with 2 out of 3 nodes
by manually entering which nodes are not part of the start.

