MySQL Forums

ndbd not responding but not dead
Posted by: Patrick Chun
Date: June 27, 2005 05:17PM

Jonathan Miller wrote:
> Patrick,
>
> I have not seen it fail in my testing, but I have
> always been actually logged into the ndb_mgm
> client and issued the shutdown from inside the
> client. It should not act differently. I will
> start using the command-line option to shutdown
> the cluster system during my testing and see if I
> can reproduce this. You might try the shutdowns
> from inside the ndb_mgm client (logged in) and see
> if you get different results.
>
> Thanks!
>

Dear Jonathan,

We execute the '/sur/bin/ndb_mgm -e shutdown' command from a script. Doing it this way lets our server detect if and when an ndbd node is down and restart it automatically from the same script.


We have more evidence today:

When we tried to access the cluster at around 10am today, our application returned this error message:

Can't lock file (errno: 4009)

We have also found in our error log file 'ndb_2_error.log':

Date/Time: Monday 27 June 2005 - 05:03:18
Type of error: error
Message: System error
Fault ID: 2303
Problem data: Node 2 killed this node because GCP stop was detected
Object of reference: NDBCNTR (Line: 193) 0x0000000a
ProgramName: /usr/sbin/ndbd
ProcessID: 3529
TraceFile: /var/lib/mysql-cluster/ndb_2_trace.log.7
Version 4.1.12
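
In case it helps anyone who wants to pull these fields out of the error log from a script, here is a minimal sketch in Python. It assumes every entry looks like the one above ('Key: value' lines plus a bare 'Version' line); the function name parse_ndb_error_entry is hypothetical and not part of any MySQL tool.

```python
def parse_ndb_error_entry(text):
    """Parse one error-log entry of 'Key: value' lines into a dict.

    Assumption: the layout matches the entry quoted in this post.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ': ' only, so values that contain
        # colons (times, '(Line: 193)') stay intact.
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
        elif line.startswith("Version "):
            # The version line has no colon in the log.
            fields["Version"] = line[len("Version "):]
    return fields


entry = """\
Date/Time: Monday 27 June 2005 - 05:03:18
Type of error: error
Message: System error
Fault ID: 2303
Problem data: Node 2 killed this node because GCP stop was detected
Object of reference: NDBCNTR (Line: 193) 0x0000000a
ProgramName: /usr/sbin/ndbd
ProcessID: 3529
TraceFile: /var/lib/mysql-cluster/ndb_2_trace.log.7
Version 4.1.12
"""

fields = parse_ndb_error_entry(entry)
print(fields["Fault ID"], "-", fields["Problem data"])
```

A monitoring script could compare the 'ProcessID' field against the process list to see whether the process that logged the fault is still running.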

The ndbd process apparently became unresponsive at 5:03am, yet at around 10am it still showed up in the output of 'ps aux'.

The machine appeared to be running normally the day before; we ran ordinary SQL SELECT/UPDATE/INSERT statements without any apparent problem. Curiously, the machine was not in use at or around 5am on Monday, the time of the crash; it was just sitting idle in our lab.

When our team got back to work in the morning and ran the above-mentioned command '/sur/bin/ndb_mgm -e shutdown', it did not kill the ndbd process(es): 'ps aux' still listed them afterwards.
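
The check described above (issue the shutdown, then look for ndbd in 'ps aux') could be scripted roughly as follows. This is only a sketch under assumptions: the helper names, the 30-second wait, and the sample 'ps' output are made up for illustration, and 'ndb_mgm' is assumed to be on the PATH.

```python
import os
import subprocess
import time


def ndbd_pids(ps_output):
    """Return the PIDs of ndbd processes found in `ps aux` output."""
    pids = []
    for line in ps_output.splitlines():
        # `ps aux` prints 11 columns; the last one is the full command.
        cols = line.split(None, 10)
        if len(cols) < 11 or not cols[1].isdigit():
            continue  # header line or malformed row
        program = cols[10].split()[0]
        if os.path.basename(program) == "ndbd":
            pids.append(int(cols[1]))
    return pids


def shutdown_and_verify(wait_seconds=30):
    """Issue the cluster shutdown, wait, and return surviving ndbd PIDs.

    A non-empty result means ndbd is still in the process table after
    the shutdown request, which is the symptom in this thread.
    """
    subprocess.run(["ndb_mgm", "-e", "shutdown"], check=False)
    time.sleep(wait_seconds)  # hypothetical grace period
    ps = subprocess.run(["ps", "aux"], capture_output=True,
                        text=True, check=True)
    return ndbd_pids(ps.stdout)


# Made-up sample output, only to show what the parser picks up.
sample_ps = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      3529  0.0  1.2 123456  7890 ?        Ssl  Jun26   0:05 /usr/sbin/ndbd
root      4000  0.0  0.1   1234   567 pts/0    R+   10:00   0:00 ps aux
"""
print(ndbd_pids(sample_ps))
```

A script would call shutdown_and_verify() and treat any returned PID as a node that ignored the shutdown and needs manual attention.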

This is exactly the symptom we described in our first post at the beginning of this thread.

Hope this provides more clues.

Yours,
Patrick
