MySQL Forums
Forum List  »  Performance

Re: Analysis: INSERT ON DUPLICATE UPDATE VS UPDATE
Posted by: Ted Wennmark
Date: May 25, 2021 12:17AM

Hello,

Something that is really bad for NDB internal batching is INSERTS with dependencies, like "INSERT IGNORE" or "INSERT ON DUPLICATE KEY".

Using "INSERT IGNORE" or "INSERT ON DUPLICATE KEY" you break batching in NDB Cluster, all ROWS are inserted one-by-one and not in bathes this is why NDB is slow in this case.
The solution is not handle this in the application code, fetch all the rows you want to insert then run a set of UPDATE/INSERT statements.

Simple test with "INSERT IGNORE":
1) Create simple test table
CREATE TABLE subscribers (
email VARCHAR(50) NOT NULL PRIMARY KEY
) engine=ndbcluster;

2) Generate some test data, only values part:
for i in {1..1000}; do echo "('kalle$i@gmail.com'),"; done > 1000-VALUES.sql
for i in {1..500}; do echo "('kalle$i@gmail.com'),"; done > 500-VALUES.sql

3) Run some tests:

Insert 500 values in one INSERT:
[opc@student1-server1 ~]$ time mysql -uroot -S /tmp/mysql.mycluster.50.sock ted < INSERT-500.sql
real 0m0.024s
user 0m0.006s
sys 0m0.002s

Insert 1000 values using INSERT IGNORE (500 duplicates)
[opc@student1-server1 ~]$ time mysql -uroot -S /tmp/mysql.mycluster.50.sock ted < INSERT-IGNORE-1000.sql
real 0m0.695s
user 0m0.005s
sys 0m0.003s

Truncate table and run some more test:
time mysql -uroot -S /tmp/mysql.mycluster.50.sock ted -se "truncate table subscribers"

Insert 1000 values:
[opc@student1-server1 ~]$ time mysql -uroot -S /tmp/mysql.mycluster.50.sock ted < INSERT-1000.sql
real 0m0.025s
user 0m0.007s
sys 0m0.001s

So, to summarize:
- Insert of 500 rows 0.024s
- Insert of 1000 rows 0.025s
- Insert ignore (with 500 duplicates) 0.695s ...

As you can see INSERT IGNORE is not optimal, it's much better (quicker) to first read all rows, remove duplicates from insert statement and then run the insert.

Last test, insert ignore with all duplicates:

First insert 1000 rows:
[opc@student1-server1 ~]$ time mysql -uroot -S /tmp/mysql.mycluster.50.sock ted < INSERT-1000.sql
real 0m0.025s
user 0m0.007s
sys 0m0.001s

Then insert 1000 rows with INSERT IGNORE (all duplicates)
[opc@student1-server1 ~]$ time mysql -uroot -S /tmp/mysql.mycluster.50.sock ted < INSERT-IGNORE-1000.sql
real 0m0.402s
user 0m0.003s
sys 0m0.004s

And again ....
[opc@student1-server1 ~]$ time mysql -uroot -S /tmp/mysql.mycluster.50.sock ted < INSERT-IGNORE-1000.sql
real 0m0.419s
user 0m0.007s
sys 0m0.001s

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Analysis: INSERT ON DUPLICATE UPDATE VS UPDATE
443
May 25, 2021 12:17AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.