Re: Wait LCP to ensure durability
We have not been able to bring up Node 6 all week, so the cluster has been running without it the whole time.
Undo space has kept growing because local checkpoint 5029, which started on 2017-11-21 19:59:49, never finished.
Node 6 crashed on 2017-11-21 22:35.
I executed ALL DUMP 7010, 7011, 7012, 7013 and 7014 just now:
---
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 3: c_lcpState.lcpStatusUpdatedPlace = 21355, cLcpStart = 0
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 3: c_blockCommit = 0, c_blockCommitNo = 11
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 4: c_lcpState.lcpStatusUpdatedPlace = 21355, cLcpStart = 0
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 4: c_blockCommit = 0, c_blockCommitNo = 11
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 5: c_lcpState.lcpStatusUpdatedPlace = 21355, cLcpStart = 0
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 5: c_blockCommit = 0, c_blockCommitNo = 11
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_COPY_GCIREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_COPY_TABREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_UPDATE_FRAG_STATEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_DIH_SWITCH_REPLICA_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_EMPTY_LCP_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_GCP_COMMIT_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_GCP_PREPARE_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_GCP_SAVEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_SUB_GCP_COMPLETE_REP_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_INCL_NODEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_MASTER_GCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_MASTER_LCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_START_INFOREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_START_RECREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_STOP_ME_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_TC_CLOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_TCGETOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_COPY_GCIREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_COPY_TABREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_UPDATE_FRAG_STATEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_DIH_SWITCH_REPLICA_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_EMPTY_LCP_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_GCP_COMMIT_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_GCP_PREPARE_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_GCP_SAVEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_SUB_GCP_COMPLETE_REP_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_INCL_NODEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_MASTER_GCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_MASTER_LCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_START_INFOREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_START_RECREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_STOP_ME_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_TC_CLOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_TCGETOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_COPY_GCIREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_COPY_TABREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_UPDATE_FRAG_STATEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_DIH_SWITCH_REPLICA_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_EMPTY_LCP_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_GCP_COMMIT_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_GCP_PREPARE_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_GCP_SAVEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_SUB_GCP_COMPLETE_REP_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_INCL_NODEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_MASTER_GCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_MASTER_LCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_START_INFOREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_START_RECREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_STOP_ME_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_TC_CLOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_TCGETOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: ParticipatingDIH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: ParticipatingLQH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=1 0000000000000008]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_From_Master_Received = 0
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: ParticipatingDIH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: ParticipatingLQH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=1 0000000000000008]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: m_LCP_COMPLETE_REP_From_Master_Received = 0
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: ParticipatingDIH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: ParticipatingLQH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=1 0000000000000008]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=1 0000000000000008]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LCP_COMPLETE_REP_From_Master_Received = 0
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: -- Node 3 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: lcpStatus = 10 (update place = 21355)
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: lcpStart = 0 lcpStopGcp = 22946293 keepGci = 0 oldestRestorable = 0
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: immediateLcpStart = 0 masterLcpNodeId = 5
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 0 : status: 9 place: 11080
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 1 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 2 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 3 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 4 : status: 0 place: 21736
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 5 : status: 10 place: 21355
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 6 : status: 9 place: 20883
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 7 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 8 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 9 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: -- Node 3 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: -- Node 4 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: lcpStatus = 10 (update place = 21355)
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: lcpStart = 0 lcpStopGcp = 22946293 keepGci = 0 oldestRestorable = 0
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: immediateLcpStart = 0 masterLcpNodeId = 5
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 0 : status: 9 place: 11080
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 1 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 2 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 3 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 4 : status: 0 place: 21736
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 5 : status: 10 place: 21355
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 6 : status: 9 place: 20883
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 7 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 8 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 9 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: -- Node 4 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: -- Node 5 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: lcpStatus = 10 (update place = 21355)
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: lcpStart = 0 lcpStopGcp = 22957140 keepGci = 22928313 oldestRestorable = 22937530
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: immediateLcpStart = 1 masterLcpNodeId = 5
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 0 : status: 9 place: 11080
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 1 : status: 8 place: 20348
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 2 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 3 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 4 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 5 : status: 5 place: 20195
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 6 : status: 4 place: 20073
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 7 : status: 3 place: 20005
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 8 : status: 7 place: 19989
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 9 : status: 1 place: 19895
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: -- Node 5 LCP STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 3: -- Node 3 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 3: c_lcpMasterTakeOverState.state = 0 updatePlace = 23294 failedNodeId = 0
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 3: c_lcpMasterTakeOverState.minTableId = 0 minFragId = 0
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 3: -- Node 3 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 4: -- Node 4 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 4: c_lcpMasterTakeOverState.state = 0 updatePlace = 11537 failedNodeId = 3
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 4: c_lcpMasterTakeOverState.minTableId = 0 minFragId = 0
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 4: -- Node 4 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 5: -- Node 5 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 5: c_lcpMasterTakeOverState.state = 0 updatePlace = 20366 failedNodeId = 3
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 5: c_lcpMasterTakeOverState.minTableId = 0 minFragId = 0
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 5: -- Node 5 LCP MASTER TAKE OVER STATE --
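To make the dump above easier to read, here is a minimal Python sketch that picks out the SignalCounter lines with a non-zero m_count and decodes the hex value next to them. It assumes the hex value is a node bitmask in which bit N stands for node N. That is an assumption on my part, though it fits ParticipatingLQH = 0000000000000038 decoding to nodes 3, 4 and 5. The function names are made up for illustration.

```python
import re

# Assumption: the 16-digit hex value is a node bitmask where bit N is node N.
COUNTER_RE = re.compile(
    r"Node (\d+): (\w+) = \[SignalCounter: m_count=(\d+) ([0-9a-fA-F]+)\]"
)


def nodes_in_mask(hex_mask):
    """Return the node IDs whose bit is set in a hex bitmask string."""
    value = int(hex_mask, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def pending_counters(log_lines):
    """Yield (reporting_node, counter_name, nodes) for counters with m_count > 0."""
    for line in log_lines:
        match = COUNTER_RE.search(line)
        if match and int(match.group(3)) > 0:
            yield int(match.group(1)), match.group(2), nodes_in_mask(match.group(4))


# Three lines copied from the ALL DUMP output above.
sample = [
    "2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]",
    "2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=1 0000000000000008]",
    "2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=1 0000000000000008]",
]

for node, name, nodes in pending_counters(sample):
    print(f"Node {node}: {name} is outstanding for nodes {nodes}")
print("ParticipatingLQH nodes:", nodes_in_mask("0000000000000038"))
```

If that reading of the bitmask is right, the only outstanding counters in the dump are m_LCP_COMPLETE_REP_Counter_LQH on nodes 3, 4 and 5 and m_LAST_LCP_FRAG_ORD on node 5, and all of them point at node 3.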
I tried to force an LCP using ALL DUMP 7099, but since LCP 5029 has not finished, no new LCP starts.
We took a mysqldump backup, since the NDB backup used to crash the NDB cluster, and are now about to restart all data nodes. We have some tables with ndb_table_no_logging=1 and will lose some data, but we at least hope to get LCPs working again by restarting the data nodes.