Hello everyone!
In an effort to make our MySQL InnoDB Cluster setup more resilient, and in particular self-healing where possible, I am testing various fault scenarios on VMs.
One of the scenarios is cutting power to all three nodes of the cluster. After the nodes are back, I log in to MySQL Shell and issue `dba.rebootClusterFromCompleteOutage()`. The cluster comes back up, but more often than not one node fails to rejoin it. This is what I see in the command's output:
```
ERROR: A GTID set check of the MySQL instance at 'instance1' determined that it is missing transactions that were purged from all cluster members
WARNING: es-flow1:3366: RuntimeError: The instance 'instance1' is missing transactions that were purged from all cluster members.
NOTE: Unable to rejoin instance 'instance1' to the cluster but the dba.rebootClusterFromCompleteOutage() operation will continue.
```
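For context, this is roughly the sequence I run once the nodes are back up (the `clusteradmin` account and the host:port values below are placeholders for my actual ones):

```
// Connect to one of the restarted nodes in MySQL Shell (JS mode).
shell.connect('clusteradmin@instance1:3306')

// Rebuild the group; the Shell seeds it from the member with the
// most up-to-date GTID set.
var cluster = dba.rebootClusterFromCompleteOutage()

// See which members came back ONLINE and which are (MISSING).
cluster.status()
```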
What I see in `Cluster.status()` is:
"instance1": {
"address": "instance1",
"instanceErrors": [
"NOTE: group_replication is stopped."
],
"memberRole": "SECONDARY",
"memberState": "OFFLINE",
"mode": "R/O",
"readReplicas": {},
"role": "HA",
"status": "(MISSING)",
"version": "8.0.24"
},
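My reading of the GTID error is that the rejoin is refused when the surviving members have already purged binary logs containing transactions the failed node never received, so incremental recovery has nothing left to replay. A rough check from the Shell (same placeholder account as above; `instance2` stands for any surviving member):

```
// On the node that failed to rejoin: what it has applied.
shell.connect('clusteradmin@instance1:3306')
session.runSql("SELECT @@GLOBAL.gtid_executed").fetchOne()

// On a surviving member: what is already gone from its binary logs.
shell.connect('clusteradmin@instance2:3306')
session.runSql("SELECT @@GLOBAL.gtid_purged").fetchOne()

// If the survivor's gtid_purged is not a subset of the failed node's
// gtid_executed, binlog-based (incremental) recovery cannot work.
```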
This sounds like a problem that calls for removing the instance from the cluster and re-adding it to force re-provisioning. In practice, though, a plain restart of the MySQL service on that node is all it takes for it to rejoin. Is this expected behavior, or is there something I could improve in my cluster's configuration?
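For reference, the remove-and-re-add path I expected to need looks roughly like this with the AdminAPI (placeholder names again; `{force: true}` should only be needed if the member is unreachable). Clone-based provisioning sidesteps the purged-GTID problem because it copies the donor's data physically instead of replaying binlogs:

```
var cluster = dba.getCluster()

// Drop the stale member from the cluster metadata.
cluster.removeInstance('instance1:3306', {force: true})

// Re-add it with a full clone from a donor, so purged GTIDs on the
// surviving members no longer matter.
cluster.addInstance('instance1:3306', {recoveryMethod: 'clone'})
```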
Thanks for your help in advance!