Re: problem restarting cluster - please help
Jonathan Miller wrote:
> Node 3 should give you an error log and a trace
> file. The information in these would help to see
> what the problem is.
Yes, sure, you're right; here is the missing information.
Node 3 error log:
Date/Time: Monday 27 June 2005 - 23:31:41
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: DbtupExecQuery.cpp
Object of reference: DBTUP (Line: 443) 0x00000008
ProgramName: /usr/sbin/ndbd
ProcessID: 28734
TraceFile: /var/lib/mysql-cluster/ndb_3_trace.log.19
Version 4.1.12
***EOM***
As for the trace file, these are the last lines; just tell me if you need more of it:
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 6922838 gsn: 264 "FSREADREQ" prio: 0
s.bn: 248 "DBACC", s.proc: 3, s.sigId: 6922837 length: 8 trace: 0 #sec: 0 fragInf: 0
UserPointer: 1
FilePointer: 1135
UserReference: H'00f80003 Operation flag: H'00000000 (No sync, Format=List of pairs)
varIndex: 1
numberOfPages: 1
pageData: H'00000010, H'00000000
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 3, r.sigId: 6922836 gsn: 164 "CONTINUEB" prio: 0
s.bn: 252 "QMGR", s.proc: 3, s.sigId: 6922834 length: 1 trace: 0 #sec: 0 fragInf: 0
H'00000004
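In case it helps when reading the signal dump: the UserReference H'00f80003 in the FSREADREQ signal looks like a block reference, with the block number in the upper 16 bits and the node id in the lower 16. That layout is my assumption, checked only against the values in this trace, and the function name below is made up for illustration. A minimal sketch:

```python
def decode_block_ref(ref):
    """Split a 32-bit block reference into (block number, node id).

    Assumed layout: block number in the upper 16 bits,
    node id in the lower 16 bits.
    """
    block_number = (ref >> 16) & 0xFFFF
    node_id = ref & 0xFFFF
    return block_number, node_id


# UserReference value taken from the FSREADREQ signal above.
block, node = decode_block_ref(0x00F80003)
print(block, node)  # 248 3
```

Under that assumption the result agrees with the sender line of the same signal (s.bn: 248 "DBACC", s.proc: 3).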
>
> You should be able to stop and start the cluster
> without having to do --initial and a restore.
>
> Also seeing the config.ini would be nice as well.
A note on the systems: the storage nodes run on two amd64 machines with 8 GB of RAM each; each machine hosts one storage node and one API node. The storage nodes are connected to each other by Gigabit Ethernet (crossover cable).
The management node is on an x86 server on a different network (i.e. communication between the management node and the storage nodes crosses a router, as one can see from config.ini).
All nodes use version 4.1.12 (mysql, ndb, management).
Backups, restores and normal operations work just fine; I can kill one storage node and restart it without any problem.
config.ini:
[NDBD DEFAULT]
NoOfReplicas=2
DataDir=/var/lib/mysql-cluster
DataMemory=5632M
IndexMemory=1700M
MaxNoOfAttributes=10000
MaxNoOfTables=1024
MaxNoOfOrderedIndexes=1024
MaxNoOfUniqueHashIndexes=512
MaxNoOfConcurrentTransactions=131072
MaxNoOfConcurrentOperations=1048576
TimeBetweenLocalCheckpoints=23
TimeBetweenGlobalCheckpoints=32000
NoOfFragmentLogFiles=200
# TimeBetweenWatchDogCheck=60000000
TimeBetweenWatchDogCheck=10000
TimeBetweenInactiveTransactionAbortCheck=2000
TransactionDeadlockDetectionTimeout=6000
NoOfDiskPagesToDiskAfterRestartTUP=160
NoOfDiskPagesToDiskAfterRestartACC=80
MaxNoOfTriggers=1536
UndoIndexBuffer=6MB
UndoDataBuffer=48MB
RedoBuffer=128MB
StopOnError=false
#LOG LEVEL
LogLevelStartup=15
LogLevelShutdown=15
LogLevelStatistic=15
LogLevelCheckpoint=15
LogLevelNodeRestart=15
LogLevelConnection=15
LogLevelError=15
LogLevelInfo=15
#OTRAS
StartFailureTimeout=1800000
# StartFailureTimeout=0
# LockPagesInMainMemory=1
[MYSQLD DEFAULT]
ArbitrationRank=0
[NDB_MGMD DEFAULT]
[TCP DEFAULT]
# Management Server
[NDB_MGMD]
Id=1
HostName=xxx.yyy.zzz.106 # the IP of THIS SERVER
DataDir=/var/lib/mysql-cluster
# Storage Engines
[NDBD]
Id=2
HostName=xxx.yyy.kkk.2 # the IP of the SECOND SERVER
#
[NDBD]
Id=3
HostName=xxx.yyy.kkk.3 # the IP of the SECOND SERVER
# 2 MySQL Clients
# I personally leave this blank to allow rapid changes of the mysql clients;
# you can enter the hostnames of the above two servers here. I suggest you don't.
[MYSQLD]
#HostName=xxx.yyy.kkk.2 # the IP of the FIRST API SERVER
#Id=4
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]
#HostName=xxx.yyy.kkk.3 # the IP of the SECOND API SERVER
#Id=5
[TCP]
NodeId1=2
NodeId2=3
HostName1=192.168.50.2
HostName2=192.168.50.3
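For reference, here is a quick check of how the memory settings in [NDBD DEFAULT] add up against the 8 GB of RAM on each storage machine. It is only a rough sketch: the values are copied by hand from the config above, the helper name parse_size is made up, and treating the "MB" suffix the same as "M" is my assumption. It also ignores whatever else ndbd allocates, so the total is a lower bound at best.

```python
import re

# Multipliers for the size suffixes used in the config (assumed binary units).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(value):
    """Convert a size such as '5632M' or '48MB' to bytes.

    Assumption: a trailing 'B' is optional and changes nothing.
    """
    m = re.fullmatch(r"(\d+)\s*([KMG]?)B?", value.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError(f"unrecognised size: {value!r}")
    return int(m.group(1)) * UNITS[m.group(2).upper()]


# Values copied from the [NDBD DEFAULT] section above.
settings = {
    "DataMemory": "5632M",
    "IndexMemory": "1700M",
    "UndoIndexBuffer": "6MB",
    "UndoDataBuffer": "48MB",
    "RedoBuffer": "128MB",
}

MB = 1024**2
total_mb = sum(parse_size(v) for v in settings.values()) // MB
print(total_mb)  # 7514
```

So these five settings alone come to 7514 MB of the 8192 MB on each machine, before counting the API node running on the same host.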
Many thanks for any help.