MySQL Forums
Forum List  »  NDB clusters

Error in ndb node failover
Posted by: Leo Chan
Date: November 21, 2005 12:56AM

Hi,

I have 3 servers that are used for the MySQL Cluster. The config.ini is as follows:
[NDBD DEFAULT]
NoOfReplicas=2 # Number of replicas
DataMemory=350M # How much memory to allocate for data storage
IndexMemory=100M # How much memory to allocate for index storage
# For DataMemory and IndexMemory, we have used the
# default values. Since the "world" database takes up
# only about 500KB, this should be more than enough for
# this example Cluster setup.


# Management process options:
[NDB_MGMD]
hostname=192.168.0.26 # Hostname or IP address of MGM node
datadir=/usr/local/mysql/cluster # Directory for MGM node logfiles
[NDB_MGMD]
hostname=192.168.0.27 # Hostname or IP address of MGM node
datadir=/usr/local/mysql/cluster # Directory for MGM node logfiles

# Options for data node "A":
[NDBD]
# (one [NDBD] section per data node)
hostname=192.168.0.26 # Hostname or IP address
datadir=/usr/local/mysql/data # Directory for this data node's datafiles

# Options for data node "B":
[NDBD]
hostname=192.168.0.27 # Hostname or IP address
datadir=/usr/local/mysql/data # Directory for this data node's datafiles

# SQL node options:
[MYSQLD]
hostname=192.168.0.30 # Hostname or IP address
[MYSQLD]
hostname=192.168.0.27
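For reference, the SQL-node side of this lives in /etc/my.cnf on each API host. A minimal sketch of what that file would contain (the connectstring hosts are assumed from the hostnames above; normally the connectstring lists the management nodes):

```ini
# /etc/my.cnf on an SQL node (sketch; adjust hosts to your setup)
[mysqld]
ndbcluster                                   # enable the NDB storage engine
ndb-connectstring=192.168.0.26,192.168.0.27  # management nodes to contact

[mysql_cluster]
ndb-connectstring=192.168.0.26,192.168.0.27  # used by ndbd and the ndb_* tools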


The output from ndb_mgm is as follows:

################## Initial run ##################
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 192.168.0.27:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 @192.168.0.26 (Version: 5.0.15, Nodegroup: 0, Master)
id=4 @192.168.0.27 (Version: 5.0.15, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1 @192.168.0.26 (Version: 5.0.15)
id=2 @192.168.0.27 (Version: 5.0.15)

[mysqld(API)] 2 node(s)
id=5 @192.168.0.30 (Version: 5.0.15)
id=6 @192.168.0.27 (Version: 5.0.15)
###############################################

I think everything is correct at this point, as I inserted 1 record on 192.168.0.27 and it was viewable on 192.168.0.30. In /etc/my.cnf I configured the mysqld (API) node on 192.168.0.30 to point to the ndbd process on 192.168.0.26, and the one on 192.168.0.27 to point to the ndbd on 192.168.0.27. Afterwards, I unplugged the cable on one ndb node, 192.168.0.26, and found I could no longer access the table running as ndbcluster. The mysql client gave this error: "ERROR 1015 (HY000): Can't lock file (errno: 4009)". When I ran ps -elf | grep ndbd on 192.168.0.27, no ndbd process was running. (On the initial run I had started it with ndbd --initial, and could see the process then.) ndb_mgm showed the following:
################# 192.168.0.26 is unplugged time @14:26 ###########
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 192.168.0.27:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 (not connected, accepting connect from 192.168.0.26)
id=4 (not connected, accepting connect from 192.168.0.27)

[ndb_mgmd(MGM)] 2 node(s)
id=1 (not connected, accepting connect from 192.168.0.26)
id=2 @192.168.0.27 (Version: 5.0.15)

[mysqld(API)] 2 node(s)
id=5 (not connected, accepting connect from 192.168.0.30)
id=6 (not connected, accepting connect from 192.168.0.27)
######################################################

I ran ndbd again on 192.168.0.27, and after a few minutes ndb_mgm showed this:
################# after restarting ndbd on 192.168.0.27, a few minutes later #######
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 (not connected, accepting connect from 192.168.0.26)
id=4 @192.168.0.27 (Version: 5.0.15, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 2 node(s)
id=1 (not connected, accepting connect from 192.168.0.26)
id=2 @192.168.0.27 (Version: 5.0.15)

[mysqld(API)] 2 node(s)
id=5 @192.168.0.30 (Version: 5.0.15)
id=6 @192.168.0.27 (Version: 5.0.15)
#######################################################
At this point I could access the table again on 192.168.0.27 or .30, but I had to wait almost 5 minutes, which is too long for a cluster environment that I want to present as a selling point to my client. I inserted 2 more records and it was fine, so there were 3 records in the table in total.
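As an aside, I believe the resume time is governed by NDB's heartbeat and start-up timeouts. A sketch of the [NDBD DEFAULT] parameters I understand to be involved (parameter names are from the MySQL 5.0 Cluster docs; the values below are illustrative, not settings I have verified):

```ini
[NDBD DEFAULT]
HeartbeatIntervalDbDb=1500    # ms between data-node heartbeats; a node is
                              # declared dead after several missed heartbeats
HeartbeatIntervalDbApi=1500   # ms between data-node -> API-node heartbeats
StartPartialTimeout=30000     # ms a restarting node waits for missing nodes
                              # before starting with a partial cluster
```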

Finally, I plugged the server 192.168.0.26 (the ndb node that was unplugged before) back in. ndb_mgm showed the following:
################# 192.168.0.26 is plugged back in ###############
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 192.168.0.27:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 @192.168.0.26 (Version: 5.0.15, Nodegroup: 0, Master)
id=4 @192.168.0.27 (Version: 5.0.15, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 2 node(s)
id=1 @192.168.0.26 (Version: 5.0.15)
id=2 @192.168.0.27 (Version: 5.0.15)

[mysqld(API)] 2 node(s)
id=5 @192.168.0.30 (Version: 5.0.15)
id=6 @192.168.0.27 (Version: 5.0.15)
########################################################

After that, when I queried on 192.168.0.30, it returned 1 record (the old image on .26); querying again returned 3 records (the new image on .27). Repeating the query alternated between the images from .26 and .27. However, queries against 192.168.0.27 only ever returned the new image, 3 records.

1. Why did the ndbd process on 192.168.0.27 disappear after 192.168.0.26 was unplugged?
2. The services on 192.168.0.27 and .30 took several minutes to resume, which is quite long for a cluster environment.
3. Why, after re-plugging 192.168.0.26, does 192.168.0.30 alternately return the new image from .27 and the old image from .26? This is data corruption.
