MySQL Forums

Problem with simple cluster?
Posted by: Michael R
Date: June 23, 2005 04:04PM

I seem to be having a problem with a very simple four-node cluster. The ndb_mgmd host apparently rejects the second ndbd (data node) host, although the first ndbd host and the MySQL host can both connect.

What actually seems to happen is that when I run ndbd on the data nodes, they all connect fine. Starting mysqld on the first data node also works. But when I go to start mysqld on the second data node, it fails and exits. It appears to be rejected by the ndb_mgmd node, although I can't figure out why. The same thing happens even if I start mysqld on the second data node first.

Here's my setup (all hosts run MySQL 4.1.12, compiled with the same options):

10.1.1.10: FBSD 5.3-Stable - Management Node
10.1.1.11: FBSD 5.3-Release - Data Node 1
10.1.1.12: FBSD 5.3-Release - Data Node 2
10.1.1.13: FBSD 4.9-Release - MySQL Node

cluster.ini file:

[ndbd default]
NoOfReplicas=2
DataMemory=80M
IndexMemory=52M
MaxNoOfConcurrentOperations=10000
MaxNoOfOrderedIndexes=512
TimeBetweenWatchDogCheck=30000

[tcp default]
PortNumber=2202

[ndb_mgmd]
Id=1
HostName=10.1.1.10

[ndbd]
Id=2
HostName=10.1.1.11

[ndbd]
Id=3
HostName=10.1.1.12

[mysqld]
Id=4
HostName=10.1.1.13
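
One way to sanity-check a config like the one above is to list the node slots it defines and see whether a given host has a [mysqld] slot it could claim. The following is a minimal Python sketch, not part of the original setup: `parse_slots` and `api_slot_available` are hypothetical helper names, and the matching rule (a slot with a HostName only accepts that host, a slot without one accepts any host) is an assumption about how the management server allocates node ids.

```python
# Hypothetical helpers: list the node slots in a cluster config and check
# whether a host could claim a [mysqld] (API) slot. The matching rule is an
# assumption, not something confirmed in the post.

CONFIG = """\
[ndbd default]
NoOfReplicas=2

[tcp default]
PortNumber=2202

[ndb_mgmd]
Id=1
HostName=10.1.1.10

[ndbd]
Id=2
HostName=10.1.1.11

[ndbd]
Id=3
HostName=10.1.1.12

[mysqld]
Id=4
HostName=10.1.1.13
"""


def parse_slots(text):
    """Return one dict per node section, skipping the '... default' sections.

    configparser is not used because section names such as [ndbd] repeat.
    """
    slots = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1].strip().lower()
            current = {"section": name}
            if not name.endswith("default"):
                slots.append(current)
            continue
        if current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip().lower()] = value.strip()
    return slots


def api_slot_available(slots, ip):
    """True if some [mysqld] slot has no HostName or a HostName equal to ip."""
    for slot in slots:
        if slot["section"] == "mysqld":
            host = slot.get("hostname")
            if host is None or host == ip:
                return True
    return False


if __name__ == "__main__":
    slots = parse_slots(CONFIG)
    for slot in slots:
        print(slot["section"], slot.get("id"), slot.get("hostname"))
    for ip in ("10.1.1.11", "10.1.1.12", "10.1.1.13"):
        print(ip, "can claim a [mysqld] slot:", api_slot_available(slots, ip))
```

Under that assumption, the only [mysqld] slot here is tied to 10.1.1.13, so a mysqld started on either data node host would find no slot to claim.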

------
my.cnf file (same on data nodes 1 and 2 and on the MySQL node):

[mysqld]
ndbcluster
ndb-connectstring=10.1.1.10
datadir=/var/db/mysql/data

[mysql_cluster]
ndb-connectstring=10.1.1.10

-----

Debug-level log output from the ndb_mgmd host (I didn't bother connecting the MySQL node for this run):

2005-06-23 14:47:01 [MgmSrvr] INFO -- NDB Cluster Management Server. Version 4.1.12
2005-06-23 14:47:01 [MgmSrvr] INFO -- Id: 1, Command port: 1186
2005-06-23 14:47:14 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 10.1.1.11, m_reserved_nodes 0000000000000006.
2005-06-23 14:47:14 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-06-23 14:47:42 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 10.1.1.12, m_reserved_nodes 000000000000000e.
2005-06-23 14:47:42 [MgmSrvr] INFO -- Node 1: Node 3 Connected
2005-06-23 14:47:42 [MgmSrvr] INFO -- Node 2: Node 3 Connected
2005-06-23 14:47:42 [MgmSrvr] INFO -- Node 2: Start phase 1 completed
2005-06-23 14:47:45 [MgmSrvr] INFO -- Node 3: CM_REGCONF president = 2, own Node = 3, our dynamic id = 2
2005-06-23 14:47:45 [MgmSrvr] INFO -- Node 2: Node 3: API version 4.1.12
2005-06-23 14:47:45 [MgmSrvr] INFO -- Node 3: Node 2: API version 4.1.12
2005-06-23 14:47:45 [MgmSrvr] INFO -- Node 3: Start phase 1 completed
2005-06-23 14:47:45 [MgmSrvr] INFO -- Node 2: Start phase 2 completed (initial start)
2005-06-23 14:47:45 [MgmSrvr] INFO -- Node 3: Start phase 2 completed (initial start)
2005-06-23 14:47:45 [MgmSrvr] INFO -- Node 3: Start phase 3 completed (initial start)
2005-06-23 14:47:45 [MgmSrvr] INFO -- Node 2: Start phase 3 completed (initial start)
2005-06-23 14:47:49 [MgmSrvr] INFO -- Node 2: Start phase 4 completed (initial start)
2005-06-23 14:47:49 [MgmSrvr] INFO -- Node 3: Start phase 4 completed (initial start)
2005-06-23 14:47:51 [MgmSrvr] INFO -- Node 2: Local checkpoint 1 started. Keep GCI = 1 oldest restorable GCI = 1
2005-06-23 14:47:52 [MgmSrvr] WARNING -- Allocate nodeid (0) failed. Connection from ip 10.1.1.12. Returned error string "Connection done from wrong host ip 10.1.1.12."
2005-06-23 14:47:52 [MgmSrvr] INFO -- Mgmt server state: node id's 1 not connected but reserved
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Start phase 5 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Start phase 6 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: President restarts arbitration thread [state=1]
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 3: Start phase 5 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 3: Start phase 6 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 3: Start phase 7 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Start phase 7 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Communication to Node 4 opened
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Communication to Node 0 opened
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Start phase 8 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Start phase 9 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Started (version 4.1.12)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 3: Communication to Node 4 opened
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 3: Communication to Node 0 opened
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 3: Start phase 8 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 3: Start phase 9 completed (initial start)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 3: Started (version 4.1.12)
2005-06-23 14:47:52 [MgmSrvr] INFO -- Node 2: Node 1: API version 4.1.12
2005-06-23 14:47:53 [MgmSrvr] INFO -- Node 3: Prepare arbitrator node 1 [ticket=707c0001a96b6179]
2005-06-23 14:47:53 [MgmSrvr] INFO -- Node 2: Started arbitrator node 1 [ticket=707c0001a96b6179]
2005-06-23 14:47:53 [MgmSrvr] INFO -- Node 3: Node 1: API version 4.1.12
2005-06-23 14:47:55 [MgmSrvr] WARNING -- Allocate nodeid (0) failed. Connection from ip 10.1.1.12. Returned error string "Connection done from wrong host ip 10.1.1.12."
2005-06-23 14:47:55 [MgmSrvr] INFO -- Mgmt server state: node id's 1 not connected but reserved
2005-06-23 14:47:58 [MgmSrvr] WARNING -- Allocate nodeid (0) failed. Connection from ip 10.1.1.12. Returned error string "Connection done from wrong host ip 10.1.1.12."
2005-06-23 14:47:58 [MgmSrvr] INFO -- Mgmt server state: node id's 1 not connected but reserved
2005-06-23 14:48:01 [MgmSrvr] WARNING -- Allocate nodeid (0) failed. Connection from ip 10.1.1.12. Returned error string "Connection done from wrong host ip 10.1.1.12."
2005-06-23 14:48:01 [MgmSrvr] INFO -- Mgmt server state: node id's 1 not connected but reserved
2005-06-23 14:48:04 [MgmSrvr] WARNING -- Allocate nodeid (0) failed. Connection from ip 10.1.1.12. Returned error string "Connection done from wrong host ip 10.1.1.12."
2005-06-23 14:48:04 [MgmSrvr] INFO -- Mgmt server state: node id's 1 not connected but reserved
2005-06-23 14:48:23 [MgmSrvr] INFO -- Node 2: Node 4 Connected
2005-06-23 14:48:23 [MgmSrvr] INFO -- Node 2: Node 4: API version 4.1.12
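
To make a log like this easier to read, the failed node id allocations can be tallied by source IP and the m_reserved_nodes value decoded. The Python sketch below is only an illustration: the function names are hypothetical, the sample is a few lines copied from the log above (with the wrapped WARNING lines joined), and it assumes that m_reserved_nodes is a hex bitmask in which bit N stands for node id N. It also reads "nodeid (0)" as "no specific id requested", which is likewise an assumption.

```python
import re
from collections import Counter

# A few lines copied from the management server log above.
LOG = """\
2005-06-23 14:47:14 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 10.1.1.11, m_reserved_nodes 0000000000000006.
2005-06-23 14:47:42 [MgmSrvr] INFO -- Mgmt server state: nodeid 3 reserved for ip 10.1.1.12, m_reserved_nodes 000000000000000e.
2005-06-23 14:47:52 [MgmSrvr] WARNING -- Allocate nodeid (0) failed. Connection from ip 10.1.1.12. Returned error string "Connection done from wrong host ip 10.1.1.12."
2005-06-23 14:47:55 [MgmSrvr] WARNING -- Allocate nodeid (0) failed. Connection from ip 10.1.1.12. Returned error string "Connection done from wrong host ip 10.1.1.12."
"""

FAILED = re.compile(r"Allocate nodeid \((\d+)\) failed\. Connection from ip ([\d.]+)\.")
RESERVED = re.compile(r"m_reserved_nodes ([0-9a-fA-F]+)")


def count_failed_allocations(log_text):
    """Count failed allocations keyed by (requested node id, source IP)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def decode_mask(hex_mask):
    """Return the node ids whose bit is set (assumes bit N means node id N)."""
    mask = int(hex_mask, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    for (node_id, ip), n in count_failed_allocations(LOG).items():
        print(f"requested id {node_id} from {ip}: {n} failure(s)")
    for line in LOG.splitlines():
        match = RESERVED.search(line)
        if match:
            print(match.group(1), "->", decode_mask(match.group(1)))
```

Read this way, the two masks decode to node ids 1, 2 and then 1, 2, 3, which matches the ndb_mgmd and the two ndbd processes. Every rejected request comes from 10.1.1.12 and asks for node id 0.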

--------------
mysqld error log from data node 2:

050623 14:28:17 mysqld started
050623 14:28:17 InnoDB: Started; log sequence number 0 43634
Configuration error: Could not alloc node id at 10.1.1.10 port 1186: Connection done from wrong host ip 10.1.1.12.
050623 14:28:30 [ERROR] Can't init databases
050623 14:28:30 [ERROR] Aborting

050623 14:28:30 InnoDB: Starting shutdown...
050623 14:28:32 InnoDB: Shutdown completed; log sequence number 0 43634
050623 14:28:32 [Note] /usr/local/libexec/mysqld: Shutdown complete

050623 14:28:32 mysqld ended

------
