
cluster crashed
Posted by: power wang
Date: July 05, 2005 10:16PM

I set up a cluster environment made up of four machines.
Each machine has its own role.
One acts as the MGM node (192.168.0.29), two are storage nodes (192.168.0.27/28), and one is a mysql node (192.168.0.26).
Each of them runs Fedora 4, and the cluster uses Version 5.0.7 (beta).

The config.ini file is:
[NDB_MGMD DEFAULT]
[MYSQLD DEFAULT]
[TCP DEFAULT]

[NDBD DEFAULT]
NoOfReplicas=2
DataDir=/var/lib/mysql-cluster
FileSystemPath=/var/lib/mysql-cluster
DataMemory=512M
IndexMemory=128M
NoOfFragmentLogFiles=300
MaxNoOfAttributes=10000
MaxNoOfTables=1024
MaxNoOfOrderedIndexes=1024
#MaxNoOfConcurrentOperations=250000

[NDB_MGMD]
hostname=192.168.0.29
DataDir=/var/lib/mysql-cluster
LogDestination=FILE:filename=cluster.log,maxsize=1000000,maxfiles=6

[NDBD]
hostname=192.168.0.27

[NDBD]
hostname=192.168.0.28

[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]

The my.cnf file is:
[MYSQLD] #Options for mysqld process:
ndbcluster #run NDB engine
ndb-connectstring=192.168.0.29 #location of MGM node

[MYSQL_CLUSTER] #Options for ndbd process:
ndb-connectstring=192.168.0.29 #location of MGM node

The cluster started up successfully and ran smoothly.
But it crashed when I ran a test against it from the MySQL test suite.
Here is the command I ran:
./test-create --fast --verbose --host='192.168.0.26' --user='test' --password='pass' --database=sq_test --log --tcpip
When the cluster crashed, I got this in cluster.log:
2005-07-06 10:38:47 [MgmSrvr] INFO -- Node 3: Started arbitrator node 1 [ticket=5dbd0001ea02764a]
2005-07-06 10:47:44 [MgmSrvr] INFO -- Node 3: Data usage increased to 80%(13139 32K pages of total 16384)
2005-07-06 10:48:32 [MgmSrvr] INFO -- Node 3: Data usage increased to 90%(14811 32K pages of total 16384)
After that, I couldn't start the cluster.
I realized that the data memory was nearly full, so I modified config.ini:
DataMemory=768M
IndexMemory=200M
(The total physical memory is 1G)
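
As a rough cross-check of the numbers above, here is a minimal Python sketch. It assumes 1G means 1024 MB and that the "32K pages" in the log message are 32 KB each; the function names are made up for illustration. It only adds up DataMemory and IndexMemory, so the real ndbd footprint (buffers, the OS itself) will be larger than what it shows.

```python
# Figures copied from the cluster.log lines and config.ini values in this post.
PAGE_KB = 32         # "32K pages" in the log message (assumed to mean 32 KB)
TOTAL_PAGES = 16384  # "of total 16384"


def pages_to_mb(pages, page_kb=PAGE_KB):
    """Convert a page count from the log into megabytes."""
    return pages * page_kb / 1024


def usage_percent(used_pages, total_pages=TOTAL_PAGES):
    """Percentage of data pages in use."""
    return 100.0 * used_pages / total_pages


def headroom_mb(data_mb, index_mb, physical_mb):
    """Physical memory left after DataMemory and IndexMemory only."""
    return physical_mb - (data_mb + index_mb)


if __name__ == "__main__":
    print("total data memory (MB):", pages_to_mb(TOTAL_PAGES))
    print("usage at first warning (%):", round(usage_percent(13139), 1))
    print("usage at second warning (%):", round(usage_percent(14811), 1))
    print("headroom, old settings (MB):", headroom_mb(512, 128, 1024))
    print("headroom, new settings (MB):", headroom_mb(768, 200, 1024))
```

The 16384 pages match the original DataMemory=512M, and the new settings leave only about 56 MB of the 1G for everything else on the storage node.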
Then I restarted the cluster, but it did not work. I got this:
2005-07-06 12:46:59 [MgmSrvr] INFO -- NDB Cluster Management Server. Version 5.0.7 (beta)
2005-07-06 12:46:59 [MgmSrvr] INFO -- Id: 1, Command port: 1186
2005-07-06 12:49:09 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip 192.168.0.27, m_reserved_nodes 0000000000000006.
2005-07-06 12:49:09 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-07-06 12:49:10 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000002.
2005-07-06 12:49:43 [MgmSrvr] INFO -- Node 2: Start phase 1 completed
2005-07-06 12:50:43 [MgmSrvr] INFO -- Node 2: Start phase 2 completed (system restart)
2005-07-06 12:50:43 [MgmSrvr] INFO -- Node 2: Start phase 3 completed (system restart)
2005-07-06 12:50:43 [MgmSrvr] ALERT -- Node 1: Node 2 Disconnected
In node 2's error log, ndb_2_error.log, I found this:
Date/Time: Wednesday 6 July 2005 - 12:50:43
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: DbdihMain.cpp
Object of reference: DBDIH (Line: 11757) 0x0000000a
ProgramName: ndbd
ProcessID: 3482
TraceFile: /var/lib/mysql-cluster/ndb_2_trace.log.13
Version 5.0.7 (beta)
***EOM***
and this on the other node:
Date/Time: Wednesday 6 July 2005 - 12:27:26
Type of error: error
Message: Job buffer congestion
Fault ID: 2334
Problem data: Job Buffer Full
Object of reference: APZJobBuffer.C
ProgramName: ndbd
ProcessID: 3193
TraceFile: /var/lib/mysql-cluster/ndb_3_trace.log.10
Version 5.0.7 (beta)
***EOM***
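
To compare the two error reports side by side, the entries can be split into key/value pairs. This is a minimal Python sketch that assumes the layout shown above ("Key: value" lines, each entry closed by ***EOM***); parse_error_log is a made-up helper, not part of any MySQL tool, and it does not interpret the trace files themselves.

```python
def parse_error_log(text):
    """Split ndb_<id>_error.log text into one dict per entry.

    Assumes each entry ends with a ***EOM*** line, as in the excerpts above.
    Lines without a colon (e.g. the version line) are collected under "Other".
    """
    entries, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "***EOM***":
            if current:
                entries.append(current)
            current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
        else:
            current.setdefault("Other", []).append(line)
    if current:  # keep a trailing entry that has no ***EOM*** marker
        entries.append(current)
    return entries


# Shortened copies of the two entries quoted in this post.
SAMPLE = """\
Date/Time: Wednesday 6 July 2005 - 12:50:43
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
TraceFile: /var/lib/mysql-cluster/ndb_2_trace.log.13
Version 5.0.7 (beta)
***EOM***
Date/Time: Wednesday 6 July 2005 - 12:27:26
Message: Job buffer congestion
Fault ID: 2334
TraceFile: /var/lib/mysql-cluster/ndb_3_trace.log.10
Version 5.0.7 (beta)
***EOM***
"""

if __name__ == "__main__":
    for entry in parse_error_log(SAMPLE):
        print(entry["Date/Time"], "|", entry["Fault ID"], "|", entry["Message"])
```

Each entry's TraceFile field then names the trace log that belongs to that particular crash.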

I simply want to know what I should do now.
Can the data be recovered? And how can I get useful information from trace.log?
Any ideas?
Thanks.
Best regards
