MySQL Forums

Backups failing
Posted by: Josh Nudell
Date: October 20, 2008 09:20AM

Hello,

I'm trying to run online backups on my cluster. However, every time I run a backup, one of my nodes disconnects and the backup fails (the node that disconnects alternates from one attempt to the next). After checking the MySQL data directory, I see a new folder and files created for a backup, but I don't know whether they are complete backups or partial files left behind by a failed backup.

Here is the command I run to start the backup, followed by the log output.

[mysql03][/var/lib/mysql-cluster]root (520)# ndb_mgm -e "start backup"
Connected to Management Server at: localhost:1186
Waiting for completed, this may take several minutes


2008-10-20 11:04:10 [MgmSrvr] INFO -- Node 2: Backup 8 started from node 1
2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 1: Node 2 Disconnected
2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 3: Node 2 Disconnected
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: Communication to Node 2 closed
2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 3: Network partitioning - arbitration required
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: President restarts arbitration thread [state=7]


Backup failed
* 3001: Could not start backup
* Backup abortet due to node failure: Permanent error: Internal error


2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 2: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 3: Arbitration won - positive reply from node 1
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: GCP Take over started
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: DICT: lock bs: 0 ops: 0 poll: 0 cnt: 0 queue:
2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 3: Backup 8 started from 1 has been aborted. Error: 1326
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: LCP Take over started
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: ParticipatingDIH = 0000000000000000
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: ParticipatingLQH = 0000000000000000
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=0 0000000000000000]
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_From_Master_Received = 1
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: GCP Take over completed
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: kk: 244514/0 0 1
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: LCP Take over completed (state = 4)
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: ParticipatingDIH = 0000000000000000
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: ParticipatingLQH = 0000000000000000
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=0 0000000000000000]
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2008-10-20 11:04:11 [MgmSrvr] INFO -- Node 3: m_LCP_COMPLETE_REP_From_Master_Received = 1
2008-10-20 11:04:12 [MgmSrvr] INFO -- Node 3: Started arbitrator node 1 [ticket=a481000c1ac8d19c]
2008-10-20 11:04:15 [MgmSrvr] INFO -- Node 3: Communication to Node 2 opened
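
To check which backup IDs were aborted without reading the log by eye, something like the following rough sketch might help. It assumes the management log uses exactly the "started", "has been aborted" and "Disconnected" wording quoted above; the function name summarize_backups and the patterns are illustrative only and are not part of any NDB tool. A backup that is reported as "started" here simply has no abort line in the text given, which does not prove that it completed.

```python
import re

# Patterns are copied from the log lines quoted in this post (assumption:
# other NDB versions may word these messages differently).
STARTED = re.compile(r"Backup (\d+) started from node (\d+)")
ABORTED = re.compile(r"Backup (\d+) started from (\d+) has been aborted\. Error: (\d+)")
DISCONNECTED = re.compile(r"Node \d+: Node (\d+) Disconnected")


def summarize_backups(log_text):
    """Return (backups, disconnected_nodes) found in management log text.

    backups maps backup ID -> {"status": ..., "error": ...}.
    "started" only means no abort line was seen for that ID.
    """
    backups = {}
    disconnected = set()
    for line in log_text.splitlines():
        aborted = ABORTED.search(line)
        if aborted:
            backups[int(aborted.group(1))] = {
                "status": "aborted",
                "error": int(aborted.group(3)),
            }
            continue
        started = STARTED.search(line)
        if started:
            # Do not overwrite an abort already recorded for this ID.
            backups.setdefault(
                int(started.group(1)), {"status": "started", "error": None}
            )
            continue
        dropped = DISCONNECTED.search(line)
        if dropped:
            disconnected.add(int(dropped.group(1)))
    return backups, disconnected


# Four lines taken from the log above.
SAMPLE_LOG = """\
2008-10-20 11:04:10 [MgmSrvr] INFO -- Node 2: Backup 8 started from node 1
2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 1: Node 2 Disconnected
2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 3: Node 2 Disconnected
2008-10-20 11:04:11 [MgmSrvr] ALERT -- Node 3: Backup 8 started from 1 has been aborted. Error: 1326
"""

backups, disconnected = summarize_backups(SAMPLE_LOG)
print(backups)
print(sorted(disconnected))
```

For the lines above, backup 8 is recorded as aborted with error 1326 and node 2 as the node that disconnected, so any files written for backup 8 should be treated as suspect.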

[mysql03][/var/lib/mysql-cluster]root (521)# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2 (not connected, accepting connect from xxx.xxx.xxx.21)
id=3 @xxx.xxx.xxx.22 (mysql-5.1.23 ndb-6.2.15, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1 @xxx.xxx.xxx.23 (mysql-5.1.23 ndb-6.2.15)

[mysqld(API)] 2 node(s)
id=4 @xxx.xxx.xxx.21 (mysql-5.1.23 ndb-6.2.15)
id=5 @xxx.xxx.xxx.22 (mysql-5.1.23 ndb-6.2.15)


Can anyone help shed some light on this issue?

Thanks,

-Josh
