Both of the data nodes in our cluster connect successfully and then, after a very short time, disconnect. The log files on these nodes show that 'Ndb kernel thread 0 is stuck' in either 'Job Handling' or 'Performing Send'. I've looked around online and seen reports of issues with Intel Xeon processors when NUMA is enabled (which we have). However, we have another cluster with a very similar configuration, on another machine that also has an Intel Xeon with NUMA enabled, and it is working just fine. We currently have two SQL + management nodes and two data nodes. The following is from the log file, where a node first got hung up on 'Job Handling' and then, after trying again, got hung up on 'Performing Send':
Our configs are as follows:
We have tried re-imaging this cluster, but the same issues persist. Any suggestions would be greatly appreciated!
Edited 1 time(s). Last edit at 01/15/2018 05:31PM by Andrew Fisher.