MySQL :: Re: ndb watchdog overslept

Re: ndb watchdog overslept

Posted by: d g
Date: February 11, 2017 08:40PM

Hi Mikael,

First of all, thank you for your reply. I tried setting DiskPageBufferMemory to 4000M. After that, the dump aborted with another error saying I should increase MaxNoOfConcurrentOperations. I had run into this error several times before: I first set MaxNoOfConcurrentOperations to 500000 in the config, then to 5000000, and after setting DiskPageBufferMemory to 4000M I increased it again to 10000000. After that I ran out of send buffer, so I increased SendBufferMemory to 64M (as well as ReceiveBufferMemory), and finally I had to increase TransactionDeadlockDetectionTimeout. Now the cluster is stopping again with the following error:

Feb 9 12:40:55 ndbdata01 ndbmtd: thr_no:13 - sleeploop 10!! (Worker thread blocked (>= 10ms) by slow consumer threads)
Feb 9 12:40:55 ndbdata01 ndbmtd: 2017-02-09 12:40:55 [ndbd] WARNING -- thr: 9: Overslept 7273 ms, expected ~10ms
Feb 9 12:40:55 ndbdata01 ndbmtd: thr_no:13 - sleeploop 10!! (Worker thread blocked (>= 10ms) by slow consumer threads)
Feb 9 12:40:55 ndbdata01 ndbmtd: thr_no:13 - sleeploop 10!! (Worker thread blocked (>= 10ms) by slow consumer threads)
Feb 9 12:40:55 ndbdata01 ndbmtd: thr_no:13 - sleeploop 10!! (Worker thread blocked (>= 10ms) by slow consumer threads)
Feb 9 12:40:55 ndbdata01 ndbmtd: 2017-02-09 12:40:55 [ndbd] WARNING -- thr: 11: Overslept 4415 ms, expected ~10ms
Feb 9 12:40:55 ndbdata01 ndbmtd: thr_no:13 - sleeploop 10!! (Worker thread blocked (>= 10ms) by slow consumer threads)
Feb 9 12:40:55 ndbdata01 ndbmtd: 2017-02-09 12:40:55 [ndbd] INFO -- /export/home2/pb2/build/sb_1-21745070-1483721047.77/rpm/BUILD/mysql-cluster-gpl-7.5.5/mysql-cluster-gpl-7.5.5/storage/ndb/src/kernel/blocks/pgman.cpp
Feb 9 12:40:55 ndbdata01 ndbmtd: 2017-02-09 12:40:55 [ndbd] INFO -- PGMAN (Line: 556) 0x00000000 Check false failed
Feb 9 12:40:55 ndbdata01 ndbmtd: 2017-02-09 12:40:55 [ndbd] INFO -- Error handler restarting system
Feb 9 12:40:56 ndbdata01 ndbmtd: 2017-02-09 12:40:56 [ndbd] INFO -- Error handler shutdown completed - exiting
Feb 9 12:40:56 ndbdata01 ndbmtd: 2017-02-09 12:40:56 [ndbd] ALERT -- Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
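
To see which threads overslept and by how much, the warning lines can be pulled out of the syslog with a short script. This is only a sketch: the regular expression is written against the exact message format quoted above, and the function name `overslept_events` is made up for this illustration, not part of any NDB tool.

```python
import re

# Matches the watchdog warnings quoted above, e.g.
# "... WARNING -- thr: 9: Overslept 7273 ms, expected ~10ms"
OVERSLEPT = re.compile(r"thr: (\d+): Overslept (\d+) ms, expected ~(\d+)ms")

def overslept_events(lines):
    """Return (thread_no, overslept_ms, expected_ms) for each matching line."""
    events = []
    for line in lines:
        m = OVERSLEPT.search(line)
        if m:
            events.append(tuple(int(g) for g in m.groups()))
    return events

# Three lines copied from the log excerpt above.
sample = [
    "Feb 9 12:40:55 ndbdata01 ndbmtd: 2017-02-09 12:40:55 [ndbd] WARNING -- thr: 9: Overslept 7273 ms, expected ~10ms",
    "Feb 9 12:40:55 ndbdata01 ndbmtd: thr_no:13 - sleeploop 10!! (Worker thread blocked (>= 10ms) by slow consumer threads)",
    "Feb 9 12:40:55 ndbdata01 ndbmtd: 2017-02-09 12:40:55 [ndbd] WARNING -- thr: 11: Overslept 4415 ms, expected ~10ms",
]

events = overslept_events(sample)
for thr, slept_ms, expected_ms in events:
    print(f"thread {thr}: overslept {slept_ms} ms (expected ~{expected_ms} ms)")
```

In practice the `sample` list would be replaced by the lines of the real syslog file.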

Here is my modified config:

[ndbd default]
NoOfReplicas=2
DataMemory=90000M
IndexMemory=10000M
DiskPageBufferMemory=4000M
CompressedBackup=true
datadir=/var/lib/mysql-cluster
NoOfFragmentLogParts=10
MaxNoOfConcurrentOperations=10000000
MaxNoOfAttributes=100000
NoOfFragmentLogFiles=32
TimeBetweenLocalCheckpoints=26
TimeBetweenGlobalCheckpoints=10000
MaxDiskWriteSpeed=600M
MinDiskWriteSpeed=200M
MaxDiskWriteSpeedOwnRestart=300M
TransactionDeadlockDetectionTimeout=3000
StopOnError=0
ODirect=1
ThreadConfig=ldm={count=10,cpubind=0-4,12-16},tc={count=4,cpubind=6-7,18-19},send={count=1,cpubind=8},recv={count=1,cpubind=20},main={count=1,cpubind=9,21},rep={count=1,cpubind=9,21},io={count=1,cpubind=9,21},watchdog={count=1,cpubind=9,21}

[tcp default]
SendBufferMemory=64M
ReceiveBufferMemory=64M

[ndb_mgmd]
NodeId=1
hostname=172.16.17.11
datadir=/var/lib/mysql-cluster

[ndb_mgmd]
NodeId=2
hostname=172.16.17.12
datadir=/var/lib/mysql-cluster

[ndbd]
hostname=172.16.17.1

[ndbd]
hostname=172.16.17.2

[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
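
The ThreadConfig line above is hard to read by eye, so here is a rough sketch that expands it and lists which CPUs are shared between thread types. It is plain Python, not an NDB utility. The helper names are invented for this example, the parser assumes each thread type appears only once, and the CPU numbering 0-23 is an assumption based on the 24 cores mentioned in this post.

```python
import re

# ThreadConfig value copied from the [ndbd default] section above.
THREAD_CONFIG = (
    "ldm={count=10,cpubind=0-4,12-16},tc={count=4,cpubind=6-7,18-19},"
    "send={count=1,cpubind=8},recv={count=1,cpubind=20},"
    "main={count=1,cpubind=9,21},rep={count=1,cpubind=9,21},"
    "io={count=1,cpubind=9,21},watchdog={count=1,cpubind=9,21}"
)

def expand_cpus(spec):
    """Expand a list such as '0-4,12-16' into a set of CPU ids."""
    cpus = set()
    for part in spec.strip(",").split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def parse_thread_config(text):
    """Map thread type -> (count, set of bound CPUs). Hypothetical helper."""
    result = {}
    for name, body in re.findall(r"(\w+)=\{([^}]*)\}", text):
        count = re.search(r"count=(\d+)", body)
        bind = re.search(r"cpubind=([\d,\-]+)", body)
        result[name] = (
            int(count.group(1)) if count else 1,
            expand_cpus(bind.group(1)) if bind else set(),
        )
    return result

parsed = parse_thread_config(THREAD_CONFIG)

# Which thread types are bound to each CPU.
users = {}
for name, (_, cpus) in parsed.items():
    for cpu in cpus:
        users.setdefault(cpu, []).append(name)

shared = {cpu: names for cpu, names in users.items() if len(names) > 1}
unused = sorted(set(range(24)) - set(users))  # ASSUMPTION: CPUs are numbered 0-23

for name, (count, cpus) in parsed.items():
    print(f"{name}: {count} thread(s) on {len(cpus)} CPU(s) {sorted(cpus)}")
print("shared CPUs:", shared)
print("unbound CPUs:", unused)
```

With this config the main, rep, io and watchdog threads all land on CPUs 9 and 21, while several CPUs are not bound at all. Whether that matters for the overslept warnings is not something this sketch can tell.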

It looks like it has something to do with MaxNoOfConcurrentOperations and DiskPageBufferMemory. For testing I also downgraded to 7.4.14, but the behavior is the same. My hardware should be fast enough: 24 (12) cores, 128GB memory, 2500GB disk space (RAID 5) with 700-800 MB/sec throughput. I tested before with 7.4.12 and had no problems; I noticed this behavior after I updated the system to 7.5.5. It seems that only one node crashes at a time, but if I test with only one node it crashes too, with the same log entries. Interestingly, the cluster again loses some data after the restart, though not all of it: the crash now happens somewhere between 4 and 6 percent data usage, and after the restart data usage is at 2 percent.
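
As a sanity check on memory, the configured sizes can be added up against the 128GB in each data node. This is a back-of-the-envelope sketch only. It assumes the "M" suffix means MiB, that 128GB means 128 GiB, and that each operation record costs roughly 1KB; that last figure is an assumption used for illustration, not something measured on this cluster. Send and receive buffers and other internal structures are ignored.

```python
# Values copied from the [ndbd default] section above, in "M" units.
config_mib = {
    "DataMemory": 90000,
    "IndexMemory": 10000,
    "DiskPageBufferMemory": 4000,
}
MAX_CONCURRENT_OPS = 10_000_000   # MaxNoOfConcurrentOperations from the config

BYTES_PER_OP_RECORD = 1024        # ASSUMPTION: roughly 1KB per operation record
HOST_RAM_MIB = 128 * 1024         # ASSUMPTION: "128GB" means 128 GiB

configured_mib = sum(config_mib.values())
op_records_mib = MAX_CONCURRENT_OPS * BYTES_PER_OP_RECORD / (1024 * 1024)
headroom_mib = HOST_RAM_MIB - configured_mib - op_records_mib

print(f"configured memory pools : {configured_mib} MiB")
print(f"operation records (est.): {op_records_mib:.0f} MiB")
print(f"estimated headroom      : {headroom_mib:.0f} MiB")
```

If the estimated headroom is small, it would be worth checking whether the host starts swapping during the dump, since swapping is one possible reason for threads oversleeping.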

Error log:

Current byte-offset of file-pointer is: 1566

Time: Friday 10 February 2017 - 11:24:56
Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 6 received; Aborted
Error object: /export/home/pb2/build/sb_0-21747926-1483612889.86/rpm/BUILD/mysql-cluster-gpl-7.4.14/mysql-cluster-gpl-7.4.14/storage/ndb/src/kernel/ndbd.cpp
Program: ndbmtd
Pid: 10187 thr: 6
Version: mysql-5.6.35 ndb-7.4.14
Trace: /var/lib/mysql-cluste

Time: Friday 10 February 2017 - 14:27:42
Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 6 received; Aborted
Error object: /export/home/pb2/build/sb_0-21747926-1483612889.86/rpm/BUILD/mysql-cluster-gpl-7.4.14/mysql-cluster-gpl-7.4.14/storage/ndb/src/kernel/ndbd.cpp
Program: ndbmtd
Pid: 10487 thr: 11
Version: mysql-5.6.35 ndb-7.4.14
Trace: /var/lib/mysql-clust

Time: Friday 10 February 2017 - 18:14:51
Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 6 received; Aborted
Error object: /export/home/pb2/build/sb_0-21747926-1483612889.86/rpm/BUILD/mysql-cluster-gpl-7.4.14/mysql-cluster-gpl-7.4.14/storage/ndb/src/kernel/ndbd.cpp
Program: ndbmtd
Pid: 11860 thr: 4
Version: mysql-5.6.35 ndb-7.4.14
Trace: /var/lib/mysql-cluste

Maybe you will see something interesting. I will report a bug with that information, but it would be very helpful if there were a workaround for this.

Again, thank you for your time and help.

Regards
Denny

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.