Hello again,
Sounds like a good approach, if you can re-write "INSERT IGNORE" and "INSERT ON DUPLICATE KEY" to normal INSERT/UPDATE statements you will see that NDB will be much more performant.
Please let me explain, if you use multi-insert or multi-update statements batching of rows will happen automatically between MySQL server and data nodes in cluster.
This functionality is controlled by the parameters:
-
https://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-api-definition.html#ndbparam-api-batchbytesize
-
https://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-api-definition.html#ndbparam-api-batchsize
For INSERTS with dependencies (like IGNORE or ON DUPLICATE KEY) the internal NDB batching is not working, this means that we are sending row by row between MySQL Server and data nodes. For a insert of 1000 rows this means 1000 round-trips over the network, this will not be very performant, if you do a normal INSERT with multiple values the internal batching kicks in and you are perhaps only doing 3-5 round trips to insert the same amount of data.
Have a good evening!
/Ted