
Re: 5.1.6-alpha NDB: Could not get apply status share
Posted by: jim shnider
Date: March 28, 2006 04:44PM

My previous post was somewhat naive, but I am still having a problem.

I read a concurrent thread: 'Mysql Cluster: Unable to create table.. Table exists error (ERROR 1050)' by Annapoorani SundarRajan and Gabriel Harriman. At the end of this thread, Gabe explains the cause of his problem: that ndbd was not fully started, but hung in one of its startup phases.

His solution was to reinitialize the ndbd filesystem, then wait to start mysqld until all data nodes were 'started'. This allowed his mysqld to create tables.
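If I understood Gabe correctly, that boils down to roughly the following sequence (the hosts and paths here are from my own setup, not his, so adjust as needed):

<quote>
# on each data node: wipe and rebuild the ndb filesystem
ndbd --initial

# on the mgm host: keep checking until every data node reports 'started'
ndb_mgm -e "all status"

# only then start the SQL node(s)
mysqld_safe &
</quote>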

I was distracted because the API node was not registering as 'connected' with the mgm console. I missed that the data node was still doing its 'starting' thing, and was stuck in Phase 1.

These log entries are generated after starting 'ndbd' while the mgm node is active:

(ndbd log - /var/lib/mysql-cluster/ndb_2_out.log)
<quote>
2006-03-28 15:43:35 [ndbd] INFO -- Angel pid: 3666 ndb pid: 3667
2006-03-28 15:43:35 [ndbd] INFO -- NDB Cluster -- DB node 2
2006-03-28 15:43:35 [ndbd] INFO -- Version 5.0.19 --
2006-03-28 15:43:35 [ndbd] INFO -- Configuration fetched at 192.168.0.20 port 1186
2006-03-28 15:43:36 [ndbd] INFO -- Start initiated (version 5.0.19)
</quote>
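(FWIW, I keep a second terminal tailing that out-file while the node starts, e.g.:)

<quote>
tail -f /var/lib/mysql-cluster/ndb_2_out.log
</quote>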

At this point, the mgm console reports:
<quote>
ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 3 node(s)
id=2 @192.168.0.212 (Version: 5.0.19, starting, Nodegroup: 0, Master)
id=3 (not connected, accepting connect from 192.168.0.214)
id=4 (not connected, accepting connect from 192.168.0.216)

[ndb_mgmd(MGM)] 1 node(s)
id=1 @192.168.0.20 (Version: 5.1.7)

[mysqld(API)] 3 node(s)
id=5 (not connected, accepting connect from any host)
id=6 (not connected, accepting connect from any host)
id=7 (not connected, accepting connect from any host)

ndb_mgm> 2 status
Node 2: starting (Phase 1) (Version 5.0.19)
</quote>

and the mgmd writes a few lines to its log:

(ndb_mgmd log - $work_dir/ndb_1_cluster.log)
<quote>
2006-03-28 15:46:02 [MgmSrvr] INFO -- Shutdown complete
2006-03-28 15:46:51 [MgmSrvr] INFO -- NDB Cluster Management Server. Version 5.1.7 (beta)
2006-03-28 15:46:51 [MgmSrvr] INFO -- Id: 1, Command port: 1186
2006-03-28 15:47:31 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 192.168.0.212, m_reserved_nodes 0000000000000006.
2006-03-28 15:47:31 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2006-03-28 15:47:32 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000002.
2006-03-28 15:48:04 [MgmSrvr] INFO -- Node 2: Start phase 1 completed
</quote>

Perhaps fortuitously, I became distracted by other matters while preparing this post. When I returned, I wanted to verify my expectation (from past trials) that the mgm console would report an error when trying to stop the data node:

<quote>
ndb_mgm> 2 stop
Node 2 has shutdown.

ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 3 node(s)
id=2 @192.168.0.212 (Version: 5.0.19, starting, Nodegroup: 0, Master)
id=3 (not connected, accepting connect from 192.168.0.214)
id=4 (not connected, accepting connect from 192.168.0.216)

[ndb_mgmd(MGM)] 1 node(s)
id=1 @192.168.0.20 (Version: 5.1.7)

[mysqld(API)] 3 node(s)
id=5 (not connected, accepting connect from any host)
id=6 (not connected, accepting connect from any host)
id=7 (not connected, accepting connect from any host)

ndb_mgm> 2 status
Node 2: starting (Phase 2) (Version 5.0.19)
</quote>

The error I expected was not reported, and the stop operation still failed (the console claimed Node 2 had shut down, but 'show' says otherwise), but the data node had managed to move on to Phase 2 (either because it had been left alone for a long time or because it received the 'stop' command...).

FYI: 'restart' (but not 'stop') works most of the time when the data node is still in Phase 1, and 'killall ndbd' stops ndbd and notifies the mgm node that it is stopping (due to signal 15).

After killall, the ndbd manages to log:
<quote>
2006-03-28 15:20:26 [ndbd] INFO -- Received signal 15. Performing stop.
2006-03-28 15:20:26 [ndbd] INFO -- Shutdown initiated
2006-03-28 15:20:26 [ndbd] INFO -- Shutdown completed - exiting
2006-03-28 15:20:26 [ndbd] INFO -- Angel shutting down
2006-03-28 15:20:26 [ndbd] INFO -- Node 2: Node shutdown completed. Initiated by signal 15.
</quote>
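(For the record, killall is just delivering SIGTERM to both processes; assuming the angel/ndb pids from the out-log earlier in this post, the equivalent targeted command would be something like:)

<quote>
kill -15 3666 3667    # angel pid and ndb pid from ndb_2_out.log
</quote>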

Now I am tempted to just leave the ndbd alone and see if it eventually gets all the way through to 'started'...
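While I wait, I'm polling with a crude loop instead of retyping '2 status' in the console (assuming ndb_mgm is run on the mgm host, so it finds localhost:1186 by default):

<quote>
while true; do ndb_mgm -e "2 status"; sleep 30; done
</quote>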

Quite clearly, I am struggling with the mysteriousness of all this.

Comments? Suggestions?
