MySQL Forums

Re: Wait LCP to ensure durability
Posted by: Thomas Waibel-BGo
Date: November 23, 2017 05:06AM

Hi,
is it advisable to force an LCP using
ndb_mgm> ALL DUMP 7099
?
There has not been a completed LCP for quite a while, and the undo space keeps growing...

...
root@infra01.dc1:~# grep "Local checkpoint" /var/lib/mysql-cluster/ndbmgm01/ndb_1_cluster.log
2017-11-20 00:29:59 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5011 completed
2017-11-20 00:30:00 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5012 started. Keep GCI = 22782266 oldest restorable GCI = 22790412
2017-11-20 02:51:34 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5012 completed
2017-11-20 02:51:35 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5013 started. Keep GCI = 22790428 oldest restorable GCI = 22798426
2017-11-20 05:16:36 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5013 completed
2017-11-20 05:16:37 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5014 started. Keep GCI = 22798428 oldest restorable GCI = 22806613
2017-11-20 07:41:17 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5014 completed
2017-11-20 07:41:18 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5015 started. Keep GCI = 22806615 oldest restorable GCI = 22814786
2017-11-20 10:20:56 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5015 completed
2017-11-20 10:20:57 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5016 started. Keep GCI = 22814787 oldest restorable GCI = 22823807
2017-11-20 12:58:41 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5016 completed
2017-11-20 12:58:42 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5017 started. Keep GCI = 22823810 oldest restorable GCI = 22832726
2017-11-20 15:40:53 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5017 completed
2017-11-20 15:40:54 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5018 started. Keep GCI = 22832728 oldest restorable GCI = 22841891
2017-11-20 18:29:08 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5018 completed
2017-11-20 18:29:09 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5019 started. Keep GCI = 22841893 oldest restorable GCI = 22851404
2017-11-20 20:59:30 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5019 completed
2017-11-20 20:59:30 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5020 started. Keep GCI = 22851407 oldest restorable GCI = 22859900
2017-11-20 23:31:42 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5020 completed
2017-11-20 23:31:43 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5021 started. Keep GCI = 22859902 oldest restorable GCI = 22868499
2017-11-21 01:52:35 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5021 completed
2017-11-21 01:52:36 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5022 started. Keep GCI = 22868500 oldest restorable GCI = 22876085
2017-11-21 04:16:59 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5022 completed
2017-11-21 04:16:59 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5023 started. Keep GCI = 22876094 oldest restorable GCI = 22884245
2017-11-21 06:48:00 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5023 completed
2017-11-21 06:48:01 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5024 started. Keep GCI = 22884247 oldest restorable GCI = 22892777
2017-11-21 09:18:45 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5024 completed
2017-11-21 09:18:46 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5025 started. Keep GCI = 22892778 oldest restorable GCI = 22901296
2017-11-21 11:55:10 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5025 completed
2017-11-21 11:55:11 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5026 started. Keep GCI = 22901298 oldest restorable GCI = 22910138
2017-11-21 14:31:44 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5026 completed
2017-11-21 14:31:44 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5027 started. Keep GCI = 22910140 oldest restorable GCI = 22918989
2017-11-21 17:16:40 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5027 completed
2017-11-21 17:16:41 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5028 started. Keep GCI = 22918991 oldest restorable GCI = 22928310
2017-11-21 19:59:48 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5028 completed
2017-11-21 19:59:49 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5029 started. Keep GCI = 22928313 oldest restorable GCI = 22937530
root@infra01.dc1:~#
...

Node 6 crashed on 2017-11-21 22:35 and has not come back up since. Undo space keeps growing, and the restart of the node stalls at:
2017-11-23 08:39:13 [ndbd] INFO -- LDM(11): We have completed restoring our fragments and executed REDO log and rebuilt ordered indexes

We have 12 LDMs per node and only 2 have completed. The CPU is idle, and it seems like nothing is happening on that node...
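
In case it helps anyone reading the grep output above: here is a minimal Python sketch that pairs each "Local checkpoint N started" line with its "completed" line and reports the duration of every LCP, plus any LCP that started but never logged a completion. It assumes the cluster log format shown in the excerpt; the function name summarize_lcps and the regular expression are only illustrative and are not part of any NDB tool.

```python
import re
from datetime import datetime

# Assumed line format, taken from the cluster log excerpt in this post:
# 2017-11-20 00:29:59 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5011 completed
LCP_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[MgmtSrvr\] INFO -- "
    r"Node (?P<node>\d+): Local checkpoint (?P<lcp>\d+) (?P<event>started|completed)"
)


def summarize_lcps(lines):
    """Return (durations, unfinished).

    durations  maps LCP id -> timedelta between its start and completion.
    unfinished maps LCP id -> start time, for LCPs with no completion line.
    """
    started = {}
    durations = {}
    for line in lines:
        m = LCP_LINE.match(line.strip())
        if not m:
            continue  # not an LCP start/completion line
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        lcp = int(m["lcp"])
        if m["event"] == "started":
            started[lcp] = ts
        elif lcp in started:
            # Completion seen: record the duration and drop the open entry.
            durations[lcp] = ts - started.pop(lcp)
    return durations, started


if __name__ == "__main__":
    # The last three LCP lines from the excerpt above.
    sample = [
        "2017-11-21 17:16:41 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5028 started. "
        "Keep GCI = 22918991 oldest restorable GCI = 22928310",
        "2017-11-21 19:59:48 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5028 completed",
        "2017-11-21 19:59:49 [MgmtSrvr] INFO -- Node 5: Local checkpoint 5029 started. "
        "Keep GCI = 22928313 oldest restorable GCI = 22937530",
    ]
    durations, unfinished = summarize_lcps(sample)
    for lcp, took in sorted(durations.items()):
        print(f"LCP {lcp}: took {took}")
    for lcp, ts in sorted(unfinished.items()):
        print(f"LCP {lcp}: started {ts}, no completion logged")
```

Run against the full log, this should make it easy to see that each earlier LCP took roughly two and a half hours, while LCP 5029 started shortly before node 6 crashed and has no completion line in the grep output.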

