Re: Wait LCP to ensure durability
Hi Mikael,
Thanks for looking into this on a Saturday! Your response came a little too late, though: we are already sweating blood here.
We executed the restart at around 09:39:41 and missed node 6 there...
Undo log application started at around 10:12. By then we had around 1.3 million pages in the undo space to work through...
At 16:03 we got the following error on node 4:
---
2017-11-25 16:03:42 [ndbd] INFO -- LGMAN: Applying Undo log - 827259 pages completed, applied 521100000 records, reached LSN 21410526021
2017-11-25 16:03:43 [ndbd] INFO -- LGMAN: Applying Undo log - 827306 pages completed, applied 521130000 records, reached LSN 21410496021
2017-11-25 16:03:44 [ndbd] INFO -- LGMAN: File undo1.log have wrong pageLSN in page: 32767
2017-11-25 16:03:44 [ndbd] INFO -- Error while reading the datapages and UNDO log
2017-11-25 16:03:44 [ndbd] INFO -- LGMAN (Line: 4426) 0x00000000
2017-11-25 16:03:44 [ndbd] INFO -- Error handler restarting system
2017-11-25 16:03:44 [ndbd] INFO -- Error handler shutdown completed - exiting
2017-11-25 16:04:55 [ndbd] INFO -- Angel detected startup failure, count: 1
2017-11-25 16:04:55 [ndbd] ALERT -- Node 4: Forced node shutdown completed. Occured during startphase 4. Caused by error 2313: 'Error while reading the datapages and UNDO log(Ndbd file system inconsistency error, please report a bug). Ndbd file system error, restart node initial'.
2017-11-25 16:04:55 [ndbd] INFO -- Ndb has terminated (pid 15499) restarting
2017-11-25 16:04:55 [ndbd] INFO -- Angel reconnected to '10.20.56.5:1186'
2017-11-25 16:04:55 [ndbd] INFO -- Angel reallocated nodeid: 4
2017-11-25 16:04:55 [ndbd] INFO -- Angel pid: 15498 started child: 31734
---
All 4 nodes restarted automatically after the forced node shutdown. At least node 6 is now included in the node group...
We do not know whether that wrong pageLSN error was just a hiccup, or what to do if it happens again...
Restoring the whole system (380 GB compressed mysqldump backup) will be a nightmare...
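For anyone watching a similar restart, here is a rough sketch of how the LGMAN progress lines quoted above could be turned into a remaining-time estimate. It is not an NDB tool: the function names (`parse_progress`, `estimate_remaining_seconds`) are made up for illustration, the regular expression is written only against the two progress lines shown in this post, and treating the roughly 1.3 million undo pages as the total that "pages completed" counts towards is an assumption on my part.

```python
import re
from datetime import datetime

# Matches the LGMAN progress lines exactly as they appear in the log above.
PROGRESS = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[ndbd\] INFO -- LGMAN: "
    r"Applying Undo log - (\d+) pages completed, applied (\d+) records, "
    r"reached LSN (\d+)"
)

# The two progress lines quoted in this post, used as sample input.
SAMPLE_LINES = [
    "2017-11-25 16:03:42 [ndbd] INFO -- LGMAN: Applying Undo log - 827259 pages completed, applied 521100000 records, reached LSN 21410526021",
    "2017-11-25 16:03:43 [ndbd] INFO -- LGMAN: Applying Undo log - 827306 pages completed, applied 521130000 records, reached LSN 21410496021",
]


def parse_progress(lines):
    """Return (timestamp, pages, records, lsn) tuples for each progress line."""
    samples = []
    for line in lines:
        m = PROGRESS.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            samples.append((ts, int(m.group(2)), int(m.group(3)), int(m.group(4))))
    return samples


def estimate_remaining_seconds(samples, total_pages):
    """Linear extrapolation between the first and last sample.

    total_pages is an assumption supplied by the caller (here: the rough
    size of the undo space). Returns None if no rate can be computed.
    """
    if len(samples) < 2:
        return None
    (t0, p0, *_), (t1, p1, *_) = samples[0], samples[-1]
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0 or p1 <= p0:
        return None
    rate = (p1 - p0) / elapsed  # pages per second
    return (total_pages - p1) / rate


if __name__ == "__main__":
    samples = parse_progress(SAMPLE_LINES)
    remaining = estimate_remaining_seconds(samples, total_pages=1_300_000)
    if remaining is not None:
        print(f"Estimated time remaining: {remaining / 3600:.1f} hours")
```

Two samples one second apart give a very noisy rate, so in practice it would make more sense to feed in progress lines spread over several minutes of the node log.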