MySQL Forums
Forum List  »  Partitioning

Re: large InnoDB table partitioning without explicit PK
Posted by: Rick James
Date: November 13, 2014 06:40PM

> Played with MyISAM - (performance) results seemed to be better than InnoDB but table level locking and possible long recover after crash discorged me from using it.

Right. Go with InnoDB.

96 PARTITIONs -- that's about the practical limit. (There are inefficiencies when having too many partitions; 5.7 / 10.1 may fix some of them.)

> Estimated number of records inserted per hour: 20M (peak 50M)

50M INSERTs/hr = 14K/sec => several performance issues. First, some questions:
* Disks? Rotating? SSDs? RAIDed?
* Batch insert? LOAD DATA? Multi-threaded?
* How random are the indexes?

You really ought to have an PRIMARY KEY, even it is an artificial AUTO_INCREMENT. (If you don't specify a PK, InnoDB gives you a _hidden_ 6-byte int-like thingie.)

> 5.2.10-MariaDB
> Change version to mariadb10.0.x to support better partition pruning (it is almost not supported now)


> RANGE partitioning

Good. See my blog for more discussion:

> Add an additional index key

Caution -- each INDEX must be updated for every row INSERTed. If they are random, then this could lead to I/O thrashing.

> Enable replication (and bin log)

Fine. Note that the longer you wait, the bigger the dump/reload will be to initialize the Slave. And the gotchas may bite you.

> Use temp tables / heap storage engine to copy in data from a single partition of Table1 and play with inserted records (do aggregations and store results into other tables)

Excellent! (I won't need to lecture on such.)
I found this to be especially efficient:
1. Load batches of rows into a MEMORY table in a _separate_ database.
2. Aggregate from that table into the Summary tables
3. Load from that table into the big Fact table.
* Use RBR (Row Based Replication)
* binlog_ignore_database = the database with the MEMORY table.
This will cut down significantly on the replication traffic.

> I'm aware of costly JDBC inserts (no batch inserts, autocommit, etc.)

You _will_ need to improve on that in order to achieve 50M inserts/hour. 90% of the cost of a single-row INSERT is overhead that could be amortized if you batch the inserts.


If you can gather thousand(s) of rows at a time in the MEMORY table, then about 2 SQLs (an INSERT..SELECT..LEFT JOIN, and a UPDATE..JOIN) will
* replace the IODKU, and
* be 2 COMMITs (autocommit) instead of thousand(s)
(If you want the details on the 2 queries, I'll scratch my head and reconstruct them. Or maybe I will finish my blog on such.)

> just partition drops.

After 96 (or 36) hours?

> `col14` smallint(5) unsigned DEFAULT '0',

Two more lectures averted -- SMALLINT (and other sizes), and UNSIGNED.
What about "NOT NULL"?

> ### my.cnf

(SHOW VARIABLES is better. Possibly not necessary at the moment. But if you would like tuning advice, provide SHOW VARIABLES and SHOW GLOBAL STATUS.)

> query_cache_size=8M

Hardly worth having. And, since you have a high ingestion rate, you probably have a high QC purge rate. Recommend

> lower_case_table_names=1


> table_cache=1024
Note that each partition is effectively a "table".

> innodb_flush_log_at_trx_commit=0

Aha, you already found that one. Recommend 2 over 0; hardly any slower, possibly more secure.

> insert into Table1(col1,col2,..., col19) values (val1, val2,..., val19) ON DUPLICATE KEY UPDATE col1 = col1

Ouch. IODKU needs a UNIQUE (or PRIMARY) key in order to check for "DUPLICATE KEY"!

Did you think about INSERT IGNORE?

After you upgrade, be sure to set
innodb_buffer_pool_instances = 8 -- or some value > 1

> Average record size 209B
> 800 million of records in the table (number of partitions: 36)
> 64GB RAM,
> innodb_buffer_pool_size=32768M
> Buffer pool hit rate 1000 / 1000

--> 167 GB for the table. (Plus indexes?) However,... Before the table gets to 32GB (re buffer_pool), you _may_ start being I/O bound. This is where the randomness of the indexes, and other issues, comes into play.
The hit rate suggests that you have not hit the problem _yet_.

Oh, one more question -- Are the records inserted in roughly timestamp order? That is, is the 'latest' partition hot while the others are cold? Another thing to note: in 5.2 (I think), INSERT does no PARTITION pruning. Another reason to upgrade. (No, I don't know for a fact that 10.0's INSERT will prune.)

If all the activity is in one partition, then the "working set" is only 1.7GB (167GB/96), which is much less than 32GB. This implies that even a 'random' index won't lead to thrashing.

Too much information? Probably not. And I love expounding on scaling/performance of MySQL.

Can any of those VARCHARs be normalized in order to shrink the disk footprint? The normalization is a couple of SQLs while the data is still in the MEMORY table -- very cheap and efficient.

Options: ReplyQuote

Written By
Re: large InnoDB table partitioning without explicit PK
November 13, 2014 06:40PM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.