MySQL Forums :: Partitioning :: large InnoDB table partitioning without explicit PK


Advanced Search

Re: large InnoDB table partitioning without explicit PK
Posted by: Miko M ()
Date: November 18, 2014 02:00AM

> Well, you forced me to learn something today:
> INSERT ... ON DUPLICATE KEY UPDATE, without any
> explicit PRIMARY or UNIQUE keys, will degenerate
> into a plain INSERT. (Or so it seems -- both from
> your timings, and from a test case I built.)
[Miko] Really? Good to hear that! ;-)

> But -- If you do add any PRIMARY/UNIQUE keys and
> insert into those fields, performance may decline
> since it will need to check for DUPLICATE KEY.
[Miko] There was a PK initially. A very bad one (a long, composite one: col1,col2,col3,col4,col5), which actually was useless (in terms of supporting business logic).
And I would have had to add another column - the partitioning key.
So I've decided to remove it completely, which resulted in decent writing performance.

> > [Miko] It was a 12 hour test, no PK, just two
> IDXs, the DB size was 700M rows, 250GB on a disk
>
> That was partitioned, with the partitions being
> smaller than the buffer_pool, and with "BEGIN -
> lots of inserts - COMMIT"? Then it seems
> feasible. How many connections were inserting
> simultaneously?
[Miko] 4-6 connections simultaneously.
There are BEGINs and COMMITs. But not many INSERTS between them.
To be more precise, there is usually one INSERT (between BEGIN and COMMIT) which is a real surprise to me as well (it seems that the JDBC inserting stream has been modified over the time...)


> > 72% %user, 16% idle (way too much for a regular
> usage).
>
> That sounds like multiple connections spread
> across multiple cores? Some of that would be
> mutex contention between the threads. It _might_
> be best to have the number of connections not be
> much more than the number of cores.
> What do you mean by "for regular usage"

[Miko] Regular usage - I mean that I would except such load in a peak or during backlog handling but not as average usage.
Regarding the HW, it is 2 x 6 cores Xeon Processors but there is ESXi underneath, and 8 (v)CPU are configured for that VM.

>
> Decreasing the number of indexes should decrease
> the CPU usage. Rationale:
> The data plus the PRIMARY KEY is a BTree; each
> secondary index is another BTree. A plain INSERT,
> with the relevant blocks already cached, is mostly
> locating the right spot in each BTree, then doing
> an INSERT. When a block fills up, there is a
> "block split". More indexes => more time.
>
> Batched INSERTs (not possible because of 3rd
> party) would also decrease the CPU usage.
[Miko] Absolutely agree!


[Miko] Another thing I just learnt today.
I've been checking whether I could use a table with the MEMORY engine as a cache where the data would be initially insterted to from JDBC stream.
It does not make any sense since with multiple connections the table lock is significant (I wish row locking was supported) and the limit is 7k inserts per second. However IO gain is amazing ;-)

A few numbers: the same traffic generated: six connections each contains 5k inserts in a loop:

### Memory table
MariaDB [qns]> select count(*) from Table1_memory where col20 between '2014-11-18 14:58:00' and '2014-11-18 14:58:59';
+----------+
| count(*) |
+----------+
| 405196 |
+----------+

#OS, sar
02:58:34 PM CPU %user %nice %system %iowait %steal %idle
02:58:35 PM all 22.22 0.00 8.05 0.00 0.00 69.73
02:58:36 PM all 22.12 0.00 7.96 0.00 0.00 69.91
02:58:37 PM all 23.98 0.00 8.25 0.00 0.00 67.77

### InnoDB table
MariaDB [qns]> select count(*) from Table1 where col20 between '2014-11-18 15:08:00' and '2014-11-18 15:08:59';
+----------+
| count(*) |
+----------+
| 1048850 |
+----------+

#OS, sar
03:10:45 PM CPU %user %nice %system %iowait %steal %idle
03:10:46 PM all 58.09 0.00 9.66 6.90 0.00 25.35
03:10:47 PM all 60.30 0.00 10.36 0.50 0.00 28.84
03:10:48 PM all 58.54 0.00 10.43 7.29 0.00 23.74


Another thing which I've observed is performance degradation with
INSERT1..INSERT5000 with AUTOCOMMIT=ON
in comparison to
BEGIN.. INSERT1..INSERT5000..COMMIT

21.5k inserts per sec versus 23.5k inserts per sec.
It is 10% only and I would expect a higher number here.
All in all, the results are better than the ones achieved in MariaDB 5.2.x
Another good reason to upgrade.

A few numbers again; traffic generated: eight connections each contains 5k inserts in a loop:

### BEGIN.. INSERT1..INSERT5000..COMMIT
MariaDB [qns]> select count(*) from Table1 where col20 > now() - interval 1 minute;
+----------+
| count(*) |
+----------+
| 1400066 |
+----------+
1 row in set (2.43 sec)

#OS, sar, iostat
avg-cpu: %user %nice %system %iowait %steal %idle
79.53 0.00 13.23 0.75 0.00 6.49

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 36.00 0.00 1.00 0.00 296.00 296.00 0.01 5.00 5.00 0.50
sdb 0.00 8703.00 0.00 175.00 0.00 71024.00 405.85 0.59 3.38 0.65 11.30

05:31:48 PM CPU %user %nice %system %iowait %steal %idle
05:31:49 PM all 79.95 0.00 12.28 0.63 0.00 7.14
05:31:50 PM all 80.43 0.00 13.05 0.88 0.00 5.65
05:31:51 PM all 81.76 0.00 12.33 0.38 0.00 5.53

### INSERT1..INSERT5000 with AUTOCOMMIT=ON
MariaDB [qns]> select count(*) from Table1 where col20 > now() - interval 1 minute;
+----------+
| count(*) |
+----------+
| 1291133 |
+----------+
1 row in set (2.39 sec)

06:42:04 PM CPU %user %nice %system %iowait %steal %idle
06:42:05 PM all 81.95 0.00 12.53 0.38 0.00 5.14
06:42:06 PM all 79.00 0.00 12.75 1.12 0.00 7.12
06:42:07 PM all 80.10 0.00 12.64 1.00 0.00 6.26
06:42:08 PM all 79.10 0.00 12.89 0.75 0.00 7.26

avg-cpu: %user %nice %system %iowait %steal %idle
79.97 0.00 12.39 0.63 0.00 7.01

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 8838.00 0.00 158.00 0.00 71968.00 455.49 0.57 3.60 0.70 11.10

Options: ReplyQuote


Subject Views Written By Posted
large InnoDB table partitioning without explicit PK 2456 Miko M 11/12/2014 07:41PM
Re: large InnoDB table partitioning without explicit PK 1390 Rick James 11/13/2014 06:40PM
Re: large InnoDB table partitioning without explicit PK 1220 Miko M 11/14/2014 03:02AM
Re: large InnoDB table partitioning without explicit PK 973 Rick James 11/15/2014 12:05AM
Re: large InnoDB table partitioning without explicit PK 992 Miko M 11/16/2014 07:39PM
Re: large InnoDB table partitioning without explicit PK 877 Rick James 11/17/2014 04:21PM
Re: large InnoDB table partitioning without explicit PK 976 Rick James 11/17/2014 07:48PM
Re: large InnoDB table partitioning without explicit PK 944 Miko M 11/18/2014 02:00AM
Re: large InnoDB table partitioning without explicit PK 989 Rick James 11/18/2014 11:35PM
Re: large InnoDB table partitioning without explicit PK 1034 Miko M 11/22/2014 06:44AM
Re: large InnoDB table partitioning without explicit PK 927 Miko M 11/18/2014 01:13AM
(non-unique) index more efficient than partition pruning 1152 Miko M 11/19/2014 12:27AM
Re: (non-unique) index more efficient than partition pruning 1029 Rick James 11/19/2014 06:01PM
Re: (non-unique) index more efficient than partition pruning 1056 Miko M 11/22/2014 08:39AM
Re: large InnoDB table partitioning without explicit PK 1043 Miko M 11/14/2014 03:04AM
Re: large InnoDB table partitioning without explicit PK 1094 Miko M 11/14/2014 03:05AM
Re: large InnoDB table partitioning without explicit PK 1177 Rick James 11/15/2014 12:32AM
Re: large InnoDB table partitioning without explicit PK 1112 Miko M 11/16/2014 08:31PM
Re: large InnoDB table partitioning without explicit PK 953 Rick James 11/19/2014 12:12AM
Re: large InnoDB table partitioning without explicit PK 919 Miko M 11/22/2014 09:07AM
Re: large InnoDB table partitioning without explicit PK 911 Rick James 11/23/2014 09:00PM
Re: large InnoDB table partitioning without explicit PK 999 Miko M 12/01/2014 11:16AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.