NDB Cluster Transaction Deadlock and Possible Memory Corruption
Posted by: ArunKumar N
Date: February 19, 2014 04:42AM

Hi Experts,

We are seeing an NDB transaction-level deadlock between two transactions, which ends with one of the transactions being aborted.

Our application is written in C++ and uses the NDB API.
NDB Cluster Version : MySQL 5.1.56-ndb-7.1.15-cluster

Table Schema :

+------------------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+---------------------+------+-----+---------+-------+
| SessionID | varchar(92) | NO | UNI | NULL | |
| SessionState | tinyint(3) unsigned | NO | | 1 | |
| UserIPAddress | binary(4) | NO | PRI | NULL | |
+------------------+---------------------+------+-----+---------+-------+

The SessionID column has a unique index.

Let's say an entry exists in the table as
UserIPAddress = 10.0.0.0, SessionState = 1 and SessionID = 1

When another entry needs to be added with values
UserIPAddress = 10.0.0.1, SessionState = 1 and SessionID = 1

Since UserIPAddress is the primary key, this is attempted as an insert of a new entry, and it throws an exception "code: 893, msg: Constraint violation e.g. duplicate value in unique index, Status: PermanentError, Classification: ConstraintViolation", because the table already contains an entry with SessionID = 1.

When we encounter this, we delete the existing entry with SessionID = 1 and insert the new entry.

Basically, this is what we do when we receive a request with UserIPAddress = A.B.C.D, SessionState = 1 and SessionID = 1:

1) Insert an entry using a transaction.
2) If an entry for SessionID = 1 already exists, an exception is thrown and the transaction is aborted.
3) Start a new transaction.
3.1) Delete the entry with SessionID = 1, using NdbIndexOperation. NO COMMIT is done.
3.2) If the delete in Step 3.1 succeeds, insert the entry UserIPAddress = A.B.C.D, SessionState = 1 and SessionID = 1. NO COMMIT is done.
3.3) If the insert in Step 3.2 succeeds, COMMIT the transaction.
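
To make the flow in Steps 1 to 3.3 easier to follow, here is a minimal in-memory sketch of it. This is not NDB API code: SessionTable, UniqueViolation, replaceBySession and storeSession are hypothetical names made up for the illustration, the "transaction" is modelled simply by working on a copy of the table, and no locking is modelled at all. It only shows the intended single-threaded behaviour (insert, and on a unique index violation delete by SessionID and insert again).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for the table above; not NDB API code.
using IpAddress = std::array<std::uint8_t, 4>;  // UserIPAddress, binary(4), primary key

struct SessionRow {
    std::string sessionId;      // SessionID, unique
    std::uint8_t sessionState;  // SessionState
};

// Thrown to mimic error 893 (duplicate value in unique index).
struct UniqueViolation : std::runtime_error {
    UniqueViolation() : std::runtime_error("893: duplicate value in unique index") {}
};

class SessionTable {
public:
    // Step 1: plain insert. Fails if the SessionID is already in use.
    void insert(const IpAddress& ip, const SessionRow& row) {
        if (byIp_.count(ip)) throw std::runtime_error("duplicate primary key");
        if (bySession_.count(row.sessionId)) throw UniqueViolation();
        byIp_.emplace(ip, row);
        bySession_.emplace(row.sessionId, ip);
    }

    // Step 3.1: delete through the unique index on SessionID.
    bool eraseBySession(const std::string& sessionId) {
        auto it = bySession_.find(sessionId);
        if (it == bySession_.end()) return false;
        const std::size_t removed = byIp_.erase(it->second);
        assert(removed == 1);  // index and base rows must stay in sync
        (void)removed;
        bySession_.erase(it);
        return true;
    }

    // Steps 3 to 3.3: delete and insert on a copy, publish only if both succeed.
    void replaceBySession(const IpAddress& ip, const SessionRow& row) {
        SessionTable work = *this;                      // "start transaction"
        if (!work.eraseBySession(row.sessionId))        // step 3.1
            throw std::runtime_error("nothing to delete");
        work.insert(ip, row);                           // step 3.2, may throw
        *this = std::move(work);                        // step 3.3, "commit"
    }

    std::optional<SessionRow> find(const IpAddress& ip) const {
        auto it = byIp_.find(ip);
        if (it == byIp_.end()) return std::nullopt;
        return it->second;
    }

    std::size_t size() const { return byIp_.size(); }

private:
    std::map<IpAddress, SessionRow> byIp_;        // base table, keyed by primary key
    std::map<std::string, IpAddress> bySession_;  // unique index on SessionID
};

// Steps 1 and 2: try the insert, fall back to delete + insert on violation.
void storeSession(SessionTable& table, const IpAddress& ip, const SessionRow& row) {
    try {
        table.insert(ip, row);
    } catch (const UniqueViolation&) {
        table.replaceBySession(ip, row);
    }
}
```

Calling storeSession first for 10.0.0.0 and then for 10.0.0.1, both with SessionID = 1, leaves a single row for 10.0.0.1 in this model, which is the outcome we expect from the real table.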

We see a deadlock in the following scenario. An entry exists in the table as
UserIPAddress = 10.0.0.0, VpnID = 1, SessionState = 1 and SessionID = 1
and we receive two requests almost simultaneously (within a fraction of a second):
1) UserIPAddress = 10.0.0.1, SessionState = 1 and SessionID = 1
2) UserIPAddress = 10.0.0.1, SessionState = 1 and SessionID = 2

Two threads start processing the two requests.

Thread-1 Starts processing request 1
-> Finishes Step 1
-> Finishes Step 2
-> Finishes Step 3.1

Now Thread-2 starts processing request 2 in parallel
-> Attempts Step 1, but is blocked waiting for Thread-1 to release a lock on a row.
I am not sure which row it attempts to lock.
Now Thread-1
-> Attempts Step 3.2, but this is also blocked, waiting for Thread-2 to release a lock on a row.

After the configured "TransactionDeadlockDetectionTimeout" expires, the transaction in Thread-2 aborts, and Thread-1 continues with Step 3.2 and succeeds.
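
The situation above can be pictured as a wait-for cycle between the two transactions. The sketch below is only an illustration of that picture, not of how NDB works internally: in our case the cluster resolves the situation through the TransactionDeadlockDetectionTimeout timeout, and WaitForGraph, inDeadlock and abortTransaction are hypothetical names made up for the example. It also says nothing about which rows are involved, since that is exactly what I do not know.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model: waitsFor[a] == b means transaction a is blocked on a
// lock held by transaction b. Each transaction waits on at most one other.
using WaitForGraph = std::map<std::string, std::string>;

// Follows the chain of waits from `start`. Returns true if the chain runs
// into a transaction it has already visited, i.e. `start` is stuck behind
// (or inside) a cycle. Returns false if the chain ends at a transaction
// that is not waiting on anyone.
bool inDeadlock(const WaitForGraph& g, const std::string& start) {
    std::set<std::string> seen;
    std::string cur = start;
    while (seen.insert(cur).second) {
        auto it = g.find(cur);
        if (it == g.end()) return false;  // cur can make progress
        cur = it->second;
    }
    return true;
}

// Models aborting `victim`: it stops waiting, and since its locks are
// released, nobody waits on it any more.
void abortTransaction(WaitForGraph& g, const std::string& victim) {
    g.erase(victim);
    for (auto it = g.begin(); it != g.end();) {
        if (it->second == victim) {
            it = g.erase(it);
        } else {
            ++it;
        }
    }
    assert(g.find(victim) == g.end());  // victim no longer waits on anyone
}
```

With the edges Thread-1 -> Thread-2 and Thread-2 -> Thread-1, inDeadlock reports a cycle for either thread, and after abortTransaction for Thread-2 the graph is empty, which matches Thread-1 continuing with Step 3.2.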

Since Step 3.1 in Thread-1 is not committed yet, Thread-1 still holds a lock on the row
UserIPAddress = 10.0.0.0, SessionID = 1 and SessionState = 1.

But I am not sure why Step 1 in Thread-2 and Step 3.2 in Thread-1 are blocked until "TransactionDeadlockDetectionTimeout" expires, i.e. which row locks are held by which transaction.

Can someone please explain which rows Thread-1 and Thread-2 are waiting to lock?

Also, in our case we can see that a lot of transactions are aborted because of this deadlock, and the application crashes with a segmentation fault while allocating memory via malloc(). Is it possible for memory to get corrupted because a lot of transactions are being aborted due to the deadlock?

I suspect memory corruption because
1) There is plenty of free memory available for malloc() to allocate.
2) If NO transactions are aborted, the application does not crash and runs smoothly.
So my only suspect is that the aborted transactions are causing memory corruption.

Also, the number of aborted transactions it takes to cause the crash is inconsistent.
Sometimes a single aborted transaction causes the crash, sometimes 60, sometimes 120.

Can someone please help me understand the deadlock and the cause of the crash?

Thanks,
Arun
