MySQL Forums
Forum List  »  NDB clusters

ndb watchdog overslept
Posted by: d g
Date: February 08, 2017 11:44AM

Hi

I upgraded my mysql-cluster to version 7.5.5. Its a testing system and i'm trying to restore a dump from InnoDB and Myisam table, i replaced the engine via regex with ndb and added a tablespace. after a few table the whole cluster crashes with following error:

ERROR 1297 (HY000) at line 1530: Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER


the log file from node 1 says :

Free extent: [ m_page_no: 7628 m_file_no: 0 m_page_idx: 1599]
2017-02-08 17:28:21 [ndbd] INFO -- DROP_TAB_REQ: tab: 149, tabLcpStatus: 3
Started thread, index = 64, id = 98724, type = NdbfsThread
2017-02-08 17:30:12 [ndbd] INFO -- Locked tid = 98724 to CPU ok
Started thread, index = 65, id = 98725, type = NdbfsThread
2017-02-08 17:30:12 [ndbd] INFO -- Locked tid = 98725 to CPU ok
Started thread, index = 66, id = 98726, type = NdbfsThread
2017-02-08 17:30:12 [ndbd] INFO -- Locked tid = 98726 to CPU ok
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
Started thread, index = 67, id = 98727, type = NdbfsThread
2017-02-08 17:30:12 [ndbd] INFO -- Locked tid = 98727 to CPU ok
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
Started thread, index = 68, id = 98728, type = NdbfsThread
2017-02-08 17:30:53 [ndbd] INFO -- Locked tid = 98728 to CPU ok
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
Started thread, index = 69, id = 98729, type = NdbfsThread
2017-02-08 17:33:17 [ndbd] INFO -- Locked tid = 98729 to CPU ok
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 10: Overslept 1654 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 9: Overslept 1662 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 3: Overslept 1732 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 11: Overslept 1663 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 4: Overslept 1741 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 5: Overslept 1714 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 5: Overslept 1823 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 9: Overslept 2260 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 10: Overslept 2274 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 4: Overslept 1760 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 7: Overslept 2366 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 3: Overslept 1775 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 11: Overslept 1745 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 6: Overslept 2391 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 8: Overslept 1807 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 2: Overslept 2576 ms, expected ~10ms
2017-02-08 17:56:17 [ndbd] WARNING -- thr: 6: Overslept 1711 ms, expected ~10ms
2017-02-08 17:56:17 [ndbd] WARNING -- thr: 4: Overslept 1788 ms, expected ~10ms
2017-02-08 17:56:18 [ndbd] WARNING -- thr: 11: Overslept 2144 ms, expected ~10ms
2017-02-08 17:56:18 [ndbd] WARNING -- thr: 3: Overslept 2164 ms, expected ~10ms
2017-02-08 17:56:18 [ndbd] WARNING -- thr: 2: Overslept 2184 ms, expected ~10ms
2017-02-08 17:56:19 [ndbd] WARNING -- thr: 9: Overslept 1513 ms, expected ~10ms
2017-02-08 18:26:55 [ndbd] INFO -- /export/home2/pb2/build/sb_1-21745070-1483721047.77/rpm/BUILD/mysql-cluster-gpl-7.5.5/mysql-cluster-gpl-7.5.5/storage/ndb/src/kernel/blocks/pgman.cpp
2017-02-08 18:26:55 [ndbd] INFO -- PGMAN (Line: 556) 0x00000004 Check false failed
2017-02-08 18:26:55 [ndbd] INFO -- Error handler restarting system
2017-02-08 18:26:55 [ndbd] INFO -- Error handler shutdown completed - exiting
2017-02-08 18:26:55 [ndbd] ALERT -- Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.


log from node 2:

Free extent: [ m_page_no: 7628 m_file_no: 0 m_page_idx: 1599]
2017-02-08 17:28:21 [ndbd] INFO -- DROP_TAB_REQ: tab: 149, tabLcpStatus: 3
Started thread, index = 64, id = 98724, type = NdbfsThread
2017-02-08 17:30:12 [ndbd] INFO -- Locked tid = 98724 to CPU ok
Started thread, index = 65, id = 98725, type = NdbfsThread
2017-02-08 17:30:12 [ndbd] INFO -- Locked tid = 98725 to CPU ok
Started thread, index = 66, id = 98726, type = NdbfsThread
2017-02-08 17:30:12 [ndbd] INFO -- Locked tid = 98726 to CPU ok
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
Started thread, index = 67, id = 98727, type = NdbfsThread
2017-02-08 17:30:12 [ndbd] INFO -- Locked tid = 98727 to CPU ok
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
Started thread, index = 68, id = 98728, type = NdbfsThread
2017-02-08 17:30:53 [ndbd] INFO -- Locked tid = 98728 to CPU ok
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
Started thread, index = 69, id = 98729, type = NdbfsThread
2017-02-08 17:33:17 [ndbd] INFO -- Locked tid = 98729 to CPU ok
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 10: Overslept 1654 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 9: Overslept 1662 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 3: Overslept 1732 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 11: Overslept 1663 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 4: Overslept 1741 ms, expected ~10ms
2017-02-08 17:55:01 [ndbd] WARNING -- thr: 5: Overslept 1714 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 5: Overslept 1823 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 9: Overslept 2260 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 10: Overslept 2274 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 4: Overslept 1760 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 7: Overslept 2366 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 3: Overslept 1775 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 11: Overslept 1745 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 6: Overslept 2391 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 8: Overslept 1807 ms, expected ~10ms
2017-02-08 17:55:13 [ndbd] WARNING -- thr: 2: Overslept 2576 ms, expected ~10ms
2017-02-08 17:56:17 [ndbd] WARNING -- thr: 6: Overslept 1711 ms, expected ~10ms
2017-02-08 17:56:17 [ndbd] WARNING -- thr: 4: Overslept 1788 ms, expected ~10ms
2017-02-08 17:56:18 [ndbd] WARNING -- thr: 11: Overslept 2144 ms, expected ~10ms
2017-02-08 17:56:18 [ndbd] WARNING -- thr: 3: Overslept 2164 ms, expected ~10ms
2017-02-08 17:56:18 [ndbd] WARNING -- thr: 2: Overslept 2184 ms, expected ~10ms
2017-02-08 17:56:19 [ndbd] WARNING -- thr: 9: Overslept 1513 ms, expected ~10ms
2017-02-08 18:26:55 [ndbd] INFO -- /export/home2/pb2/build/sb_1-21745070-1483721047.77/rpm/BUILD/mysql-cluster-gpl-7.5.5/mysql-cluster-gpl-7.5.5/storage/ndb/src/kernel/blocks/pgman.cpp
2017-02-08 18:26:55 [ndbd] INFO -- PGMAN (Line: 556) 0x00000004 Check false failed
2017-02-08 18:26:55 [ndbd] INFO -- Error handler restarting system
2017-02-08 18:26:55 [ndbd] INFO -- Error handler shutdown completed - exiting
2017-02-08 18:26:55 [ndbd] ALERT -- Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.


all data is lost after the crash see:

Node 3: Data usage is 0%(9013 32K pages of total 3200000)
Node 3: Index usage is 0%(5593 8K pages of total 1280320)
Node 4: Data usage is 0%(9013 32K pages of total 3200000)
Node 4: Index usage is 0%(5593 8K pages of total 1280320)

before the crash usage was 10%.


my config :

[ndbd default]
NoOfReplicas=2
DataMemory=100000M
IndexMemory=10000M
CompressedBackup=true
datadir=/var/lib/mysql-cluster
NoOfFragmentLogParts=10
MaxNoOfConcurrentOperations=5000000
MaxNoOfAttributes=100000
NoOfFragmentLogFiles=32
StopOnError=0
ThreadConfig=ldm={count=10,cpubind=0-4,12-16},tc={count=4,cpubind=6-7,18-19},send={count=1,cpubind=8},recv={count=1,cpubind=20},main={count=1,cpubind=9,21},rep={count=1,cpubind=9,21},io={count=1,cpubind=9,21},watchdog={count=1,cpubind=9,21}

[tcp default]
SendBufferMemory=16M
ReceiveBufferMemory=16M

[ndb_mgmd]
NodeId=1
hostname=172.16.17.11
datadir=/var/lib/mysql-cluster

[ndb_mgmd]
NodeId=2
hostname=172.16.17.12
datadir=/var/lib/mysql-cluster

[ndbd]
hostname=172.16.17.1

[ndbd]
hostname=172.16.17.2

[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[mysqld]


my.cnf on myqsl nodes:

[mysqld]
ndbcluster
bind-address = 0.0.0.0
max-allowed-packet = 128M
ndb-cluster-connection-pool=4
default_storage_engine=ndbcluster
optimizer_switch=engine_condition_pushdown=on

Options: ReplyQuote


Subject
Views
Written By
Posted
ndb watchdog overslept
2433
d g
February 08, 2017 11:44AM
1168
February 09, 2017 03:06PM
1557
d g
February 11, 2017 08:40PM
797
d g
February 14, 2017 06:12AM
885
d g
February 20, 2017 05:17AM
1002
February 21, 2017 04:36AM
874
February 21, 2017 05:27AM
837
d g
February 21, 2017 06:04AM
962
February 21, 2017 08:34AM
896
d g
February 21, 2017 10:38AM
830
d g
February 22, 2017 08:38AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.