Hello everyone!
In an effort to make our MySQL InnoDB Cluster setup more resilient, and in particular self-healing where possible, I am testing various fault scenarios on VMs.
One of the scenarios is cutting power to all three nodes of the cluster. After the nodes are back, I log in to MySQL Shell and issue `dba.rebootClusterFromCompleteOutage()`. The cluster comes back up, but more often than not one node fails to rejoin it. This is what I see in the command's output:
```
ERROR: A GTID set check of the MySQL instance at 'instance1' determined that it is missing transactions that were purged from all cluster members
WARNING: es-flow1:3366: RuntimeError: The instance 'instance1' is missing transactions that were purged from all cluster members.
NOTE: Unable to rejoin instance 'instance1' to the cluster but the dba.rebootClusterFromCompleteOutage() operation will continue.
```
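For context, this is roughly the sequence I run once the nodes are back up (the `clusteradmin` account and the host:port values below are placeholders for my actual ones):

```
// Connect to one of the restarted nodes in MySQL Shell (JS mode).
shell.connect('clusteradmin@instance1:3306')

// Rebuild the group; the Shell seeds it from the member with the
// most up-to-date GTID set.
var cluster = dba.rebootClusterFromCompleteOutage()

// See which members came back ONLINE and which are (MISSING).
cluster.status()
```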
What I see in `Cluster.status()` is:
"instance1": {
"address": "instance1",
"instanceErrors": [
"NOTE: group_replication is stopped."
],
"memberRole": "SECONDARY",
"memberState": "OFFLINE",
"mode": "R/O",
"readReplicas": {},
"role": "HA",
"status": "(MISSING)",
"version": "8.0.24"
},
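My reading of the GTID error is that the rejoin is refused when the surviving members have already purged binary logs containing transactions the failed node never received, so incremental recovery has nothing left to replay. A rough check from the Shell (same placeholder account as above; `instance2` stands for any surviving member):

```
// On the node that failed to rejoin: what it has applied.
shell.connect('clusteradmin@instance1:3306')
session.runSql("SELECT @@GLOBAL.gtid_executed").fetchOne()

// On a surviving member: what is already gone from its binary logs.
shell.connect('clusteradmin@instance2:3306')
session.runSql("SELECT @@GLOBAL.gtid_purged").fetchOne()

// If the survivor's gtid_purged is not a subset of the failed node's
// gtid_executed, binlog-based (incremental) recovery cannot work.
```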
This sounds like a problem that calls for removing the instance from the cluster and re-adding it to force re-provisioning. In practice, though, a plain restart of the MySQL service on that node is all it takes for it to rejoin. Is this expected behavior, or is there something I could improve in my cluster's configuration?
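For reference, the remove-and-re-add path I expected to need looks roughly like this with the AdminAPI (placeholder names again; `{force: true}` should only be needed if the member is unreachable). Clone-based provisioning sidesteps the purged-GTID problem because it copies the donor's data physically instead of replaying binlogs:

```
var cluster = dba.getCluster()

// Drop the stale member from the cluster metadata.
cluster.removeInstance('instance1:3306', {force: true})

// Re-add it with a full clone from a donor, so purged GTIDs on the
// surviving members no longer matter.
cluster.addInstance('instance1:3306', {recoveryMethod: 'clone'})
```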
Thanks for your help in advance!