Node 6: Forced node shutdown completed. Occurred during startphase 5. Caused by error 2341
Hi all,
I ran into this error when trying to add a third data node to the cluster. The cluster currently runs with 1 management node, 2 data nodes, and 2 SQL nodes.
I added the IP addresses of 2 new data nodes to config.ini on the management server, then did a rolling restart of the existing data nodes and the SQL nodes. Everything was fine up to this point, but when I start the third data node with the ndbd --initial command, it fails with the following message:
ndb_mgm> Node 6: Forced node shutdown completed. Occurred during startphase 5. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
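For context, the config.ini change described above is roughly of the following shape. This is an illustrative sketch, not a copy of the actual file: node IDs 4 and 6 and the /var/lib/mysql-cluster path appear in the logs below, but the remaining node IDs, NoOfReplicas=2, and all host addresses (placeholders from the 198.51.100.0/24 documentation range) are assumptions.

```ini
# Illustrative sketch only; host addresses and most node IDs are placeholders.
[ndbd default]
# Assumption: two replicas, so data nodes are added in pairs.
NoOfReplicas=2
# Matches the trace file path shown in the error log.
DataDir=/var/lib/mysql-cluster

[ndb_mgmd]
NodeId=1
HostName=198.51.100.10

# Existing data nodes
[ndbd]
NodeId=4
HostName=198.51.100.14

[ndbd]
NodeId=5
HostName=198.51.100.15

# Newly added data nodes (node 6 is the one that fails to start)
[ndbd]
NodeId=6
HostName=198.51.100.16

[ndbd]
NodeId=7
HostName=198.51.100.17

# SQL nodes
[mysqld]
NodeId=2
HostName=198.51.100.12

[mysqld]
NodeId=3
HostName=198.51.100.13
```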
ndb_6_error.log
===============
Time: Friday 5 March 2021 - 16:00:00
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: tsman.cpp
Error object: TSMAN (Line: 2984) 0x00000006 Check m_lcp_ongoing failed
Program: ndbd
Pid: 1106
Version: mysql-8.0.23 ndb-8.0.23
Trace file name: ndb_6_trace.log.2
Trace file path: /var/lib/mysql-cluster/ndb_6_trace.log.2 [t1..t1]
***EOM***
ndb_6_trace.log.2 (partial)
==========================
--------------- Signal ----------------
r.bn: 259 "TSMAN", r.proc: 6, r.sigId: 230621 gsn: 244 "END_LCPREQ" prio: 0
s.bn: 247 "DBLQH", s.proc: 6, s.sigId: 230620 length: 4 trace: 0 #sec: 0 fragInf: 0
senderData: 4, senderRef: f70006, backupPtr: 0, backupId: 4
proxyBlockNo: 2064824224
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 6, r.sigId: 230620 gsn: 243 "END_LCPCONF" prio: 0
s.bn: 244 "BACKUP", s.proc: 6, s.sigId: 230619 length: 2 trace: 0 #sec: 0 fragInf: 0
senderData: 4, senderRef: f40006
--------------- Signal ----------------
r.bn: 244 "BACKUP", r.proc: 6, r.sigId: 230619 gsn: 787 "SYNC_EXTENT_PAGES_CONF" prio: 1
s.bn: 261 "PGMAN", s.proc: 6, s.sigId: 230616 length: 2 trace: 0 #sec: 0 fragInf: 0
H'00000000 H'01050006
--------------- Signal ----------------
r.bn: 261 "PGMAN", r.proc: 6, r.sigId: 230618 gsn: 164 "CONTINUEB" prio: 1
s.bn: 261 "PGMAN", s.proc: 6, s.sigId: 230616 length: 1 trace: 0 #sec: 0 fragInf: 0
H'00000003
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 6, r.sigId: 230617 gsn: 616 "LCP_STATUS_CONF" prio: 1
s.bn: 244 "BACKUP", s.proc: 6, s.sigId: 230615 length: 12 trace: 0 #sec: 0 fragInf: 0
SenderRef : f40006 SenderData : 1 LcpState : 14 tableId : 4294967295 fragId : 4294967295
replica(Progress : 4294967295), lcpDone (Rows : 0, Bytes : 0)
lcpScannedPages : 0
--------------- Signal ----------------
r.bn: 261 "PGMAN", r.proc: 6, r.sigId: 230616 gsn: 786 "SYNC_EXTENT_PAGES_REQ" prio: 1
s.bn: 244 "BACKUP", s.proc: 6, s.sigId: 230613 length: 3 trace: 0 #sec: 0 fragInf: 0
H'00000000 H'00f40006 H'00000003
--------------- Signal ----------------
r.bn: 244 "BACKUP", r.proc: 6, r.sigId: 230615 gsn: 615 "LCP_STATUS_REQ" prio: 1
s.bn: 247 "DBLQH", s.proc: 6, s.sigId: 230612 length: 2 trace: 0 #sec: 0 fragInf: 0
SenderRef : f70006 SenderData : 1
--------------- Signal ----------------
r.bn: 251 "NDBCNTR", r.proc: 6, r.sigId: 230614 gsn: 812 "START_DISTRIBUTED_LCP_ORD" prio: 1
s.bn: 247 "DBLQH", s.proc: 6, s.sigId: 230612 length: 1 trace: 0 #sec: 0 fragInf: 0
H'00000004
--------------- Signal ----------------
r.bn: 244 "BACKUP", r.proc: 6, r.sigId: 230613 gsn: 244 "END_LCPREQ" prio: 0
s.bn: 247 "DBLQH", s.proc: 6, s.sigId: 230612 length: 4 trace: 0 #sec: 0 fragInf: 0
senderData: 4, senderRef: f70006, backupPtr: 0, backupId: 4
proxyBlockNo: 0
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 6, r.sigId: 230612 gsn: 365 "LCP_FRAG_ORD" prio: 1
s.bn: 246 "DBDIH", s.proc: 4, s.sigId: 2153169 length: 6 trace: 0 #sec: 0 fragInf: 0
LcpId: 4 LcpNo: 0 Table: -256 Fragment: 0
KeepGCI: 765 LastFragmentFlag: 1
--------------- Signal ----------------
r.bn: 265 "THRMAN", r.proc: 6, r.sigId: 230611 gsn: 849 "UPDATE_THR_LOAD_ORD" prio: 1
s.bn: 265 "THRMAN", s.proc: 6, s.sigId: 230610 length: 3 trace: 0 #sec: 0 fragInf: 0
H'00000010 H'00000000 H'00000000
Can someone help, please? I can't add more than 2 data nodes because of this issue, and it is holding us back from testing with 4 or 6 data nodes.