Re: Wait LCP to ensure durability
We have not been able to bring Node 6 back up all week, so the cluster has been running degraded the whole time.
Undo space has kept growing because local checkpoint 5029, which started on 2017-11-21 19:59:49, never finished.
Node 6 crashed on 2017-11-21 22:35.
I executed ALL DUMP 7010, 7011, 7012, 7013 and 7014 just now:
---
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 3: c_lcpState.lcpStatusUpdatedPlace = 21355, cLcpStart = 0
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 3: c_blockCommit = 0, c_blockCommitNo = 11
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 4: c_lcpState.lcpStatusUpdatedPlace = 21355, cLcpStart = 0
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 4: c_blockCommit = 0, c_blockCommitNo = 11
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 5: c_lcpState.lcpStatusUpdatedPlace = 21355, cLcpStart = 0
2017-11-25 09:19:32 [MgmtSrvr] INFO -- Node 5: c_blockCommit = 0, c_blockCommitNo = 11
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_COPY_GCIREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_COPY_TABREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_UPDATE_FRAG_STATEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_DIH_SWITCH_REPLICA_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_EMPTY_LCP_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_GCP_COMMIT_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_GCP_PREPARE_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_GCP_SAVEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_SUB_GCP_COMPLETE_REP_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_INCL_NODEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_MASTER_GCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_MASTER_LCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_START_INFOREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_START_RECREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_STOP_ME_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_TC_CLOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 3: c_TCGETOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_COPY_GCIREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_COPY_TABREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_UPDATE_FRAG_STATEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_DIH_SWITCH_REPLICA_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_EMPTY_LCP_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_GCP_COMMIT_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_GCP_PREPARE_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_GCP_SAVEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_SUB_GCP_COMPLETE_REP_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_INCL_NODEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_MASTER_GCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_MASTER_LCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_START_INFOREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_START_RECREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_STOP_ME_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_TC_CLOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 4: c_TCGETOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_COPY_GCIREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_COPY_TABREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_UPDATE_FRAG_STATEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_DIH_SWITCH_REPLICA_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_EMPTY_LCP_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_GCP_COMMIT_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_GCP_PREPARE_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_GCP_SAVEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_SUB_GCP_COMPLETE_REP_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_INCL_NODEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_MASTER_GCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_MASTER_LCPREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_START_INFOREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_START_RECREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_STOP_ME_REQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_TC_CLOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:19:57 [MgmtSrvr] INFO -- Node 5: c_TCGETOPSIZEREQ_Counter = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: ParticipatingDIH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: ParticipatingLQH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=1 0000000000000008]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_From_Master_Received = 0
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: ParticipatingDIH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: ParticipatingLQH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=1 0000000000000008]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 4: m_LCP_COMPLETE_REP_From_Master_Received = 0
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: ParticipatingDIH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: ParticipatingLQH = 0000000000000038
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=1 0000000000000008]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=1 0000000000000008]
2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LCP_COMPLETE_REP_From_Master_Received = 0
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: -- Node 3 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: lcpStatus = 10 (update place = 21355)
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: lcpStart = 0 lcpStopGcp = 22946293 keepGci = 0 oldestRestorable = 0
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: immediateLcpStart = 0 masterLcpNodeId = 5
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 0 : status: 9 place: 11080
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 1 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 2 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 3 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 4 : status: 0 place: 21736
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 5 : status: 10 place: 21355
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 6 : status: 9 place: 20883
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 7 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 8 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: 9 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 3: -- Node 3 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: -- Node 4 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: lcpStatus = 10 (update place = 21355)
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: lcpStart = 0 lcpStopGcp = 22946293 keepGci = 0 oldestRestorable = 0
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: immediateLcpStart = 0 masterLcpNodeId = 5
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 0 : status: 9 place: 11080
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 1 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 2 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 3 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 4 : status: 0 place: 21736
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 5 : status: 10 place: 21355
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 6 : status: 9 place: 20883
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 7 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 8 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: 9 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 4: -- Node 4 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: -- Node 5 LCP STATE --
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: lcpStatus = 10 (update place = 21355)
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: lcpStart = 0 lcpStopGcp = 22957140 keepGci = 22928313 oldestRestorable = 22937530
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: immediateLcpStart = 1 masterLcpNodeId = 5
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 0 : status: 9 place: 11080
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 1 : status: 8 place: 20348
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 2 : status: 2 place: 18117
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 3 : status: 6 place: 17933
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 4 : status: 5 place: 816
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 5 : status: 5 place: 20195
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 6 : status: 4 place: 20073
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 7 : status: 3 place: 20005
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 8 : status: 7 place: 19989
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: 9 : status: 1 place: 19895
2017-11-25 09:20:08 [MgmtSrvr] INFO -- Node 5: -- Node 5 LCP STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 3: -- Node 3 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 3: c_lcpMasterTakeOverState.state = 0 updatePlace = 23294 failedNodeId = 0
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 3: c_lcpMasterTakeOverState.minTableId = 0 minFragId = 0
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 3: -- Node 3 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 4: -- Node 4 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 4: c_lcpMasterTakeOverState.state = 0 updatePlace = 11537 failedNodeId = 3
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 4: c_lcpMasterTakeOverState.minTableId = 0 minFragId = 0
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 4: -- Node 4 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 5: -- Node 5 LCP MASTER TAKE OVER STATE --
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 5: c_lcpMasterTakeOverState.state = 0 updatePlace = 20366 failedNodeId = 3
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 5: c_lcpMasterTakeOverState.minTableId = 0 minFragId = 0
2017-11-25 09:20:10 [MgmtSrvr] INFO -- Node 5: -- Node 5 LCP MASTER TAKE OVER STATE --
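The hex values in the dump above can be read as node bitmasks. The following is a minimal sketch, assuming that bit n of each 16-digit hex value stands for node id n; that reading is an assumption, but it is consistent with ParticipatingLQH = 0000000000000038 covering the three running data nodes 3, 4 and 5. The names `nodes_in_mask`, `decode` and `LINE_RE` are made up for this illustration and are not part of any NDB tool.

```python
import re

# Matches "Node N: <name> = <16 hex digits>", with or without the
# "[SignalCounter: m_count=K ...]" wrapper seen in the dump.
LINE_RE = re.compile(
    r"Node (\d+): (\w+) = (?:\[SignalCounter: m_count=\d+ )?([0-9A-Fa-f]{16})\]?"
)


def nodes_in_mask(hex_mask):
    """Return the bit positions set in hex_mask (assumed to be node ids)."""
    mask = int(hex_mask, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def decode(lines):
    """Map (reporting node, variable name) to the list of node ids in its mask."""
    result = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            reporter, name, hex_mask = match.groups()
            result[(int(reporter), name)] = nodes_in_mask(hex_mask)
    return result


# Three lines copied from the dump above.
sample = [
    "2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: ParticipatingLQH = 0000000000000038",
    "2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=1 0000000000000008]",
    "2017-11-25 09:20:02 [MgmtSrvr] INFO -- Node 5: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=1 0000000000000008]",
]

for (reporter, name), nodes in decode(sample).items():
    print(f"Node {reporter}: {name} -> nodes {nodes}")
```

Under that assumption, 0000000000000038 decodes to nodes 3, 4 and 5, and 0000000000000008 decodes to node 3 only. If the reading is right, the non-empty counters would suggest the LCP is still waiting on something from node 3's LQH, but I have not confirmed that interpretation.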
I tried to force an LCP using ALL DUMP 7099, but since LCP 5029 has not finished, no new LCP starts.
We took a mysqldump backup, since the NDB backup used to crash the NDB cluster, and are now about to restart all data nodes. We have some tables created with ndb_table_no_logging=1 and will lose some data, but we at least hope that restarting the data nodes will get LCPs running again.