MySQL Forums
Forum List  »  Performance

Re: MySQL performance help
Posted by: Rick James
Date: January 08, 2012 12:04PM

> The table is partitioned for every 1 hour, example: YYYYMMDDHH.
You could (and I actually did so last month) create a single table that is PARTITIONed. Then have hourly and daily scripts to do ALTER TABLE ... REORGANIZE PARTITION to turn 24 hourly partitions in to one daily partition, etc.
Let's have some numbers -- How many days of data do you intend to keep?
(PARTITION has a limit of 1024 partitions; even that number may be impracticably high.)

I recommend hourly aggregation only (which is what I did); you can get daily stats from the hourly aggregation (summary) tables. Perhaps you (as I) also have some daily-only summary tables.

> starting with time stamp column as the first column in the key
Without seeing the SELECTs, etc, I cannot judge that choice.

> The database goes through ~50k row insertion per second.
> Insertion speed
Either of these is orders of magnitude faster than one-row-at-a-time:
* Batched inserts: INSERT ... VALUES (...), (...), ... -- recommend up to 1000 rows at a time.
* LOAD DATA

> Insertion speed
If practical, sort the rows to be inserted on the PRIMARY KEY. (Yeah, you don't have one; I may come back to that.)

> key_buffer_size = 256M
All your tables are InnoDB? Then this is wasting RAM. See
http://mysql.rjweb.org/doc.php/memory

> innodb_buffer_pool_size = 4G
> The available memory is 4 GB.
NO! See the above link. Swapping is _much_ worse than having a smaller buffer_pool.

> The average row size is 380. All the columns are big int unsigned not null.
Please provide the actual SHOW CREATE TABLE.
Please think about every BIGINT -- which ones can be INT or MEDIUMINT, etc. Smaller -> more cacheable -> faster.
Consider whether the table can be split -- do you need all the columns all the time?

> The duration can be for last 10 minutes
Maybe the PARTITIONing should be on 10-minute intervals (use RANGE). You would probably need to conver the datetime to INT UNSIGNED to make the partitioning possible.

> ~50k row insertion per second
Think seriously about _not_ keeping the raw data. Instead, collect data for, say, one minute (3M rows), aggregate it is multiple ways, store only the aggregate info into tables.
If you decide you must keep the raw data, then think about doing _files_ instead of _tables_ for each hour (or other unit of time). These files could be broken into directories based on the date (or other unit of time). A million files in a single directory is painful (that is, slow) for the Operating System).

Another tip -- repartitioning involves reading the data; this can interfere with the INSERTs. If this interference becomes a problem, then load into a temp table (perhaps Engine=MEMORY, for extra speed), then (1) copy to the 'real' table and (2) aggregate from that temp table, (3) TRUNCATE the temp table. Since that creates its own interference, ping-pong between two temp tables -- one being loaded, and the other being shovelled into other tables. If you do use memory, SET max_heap_table_size immediately before doing CREATE TABLE; also be sure to subtract that space from 'available' memory when calculating buffer_pool_size. (I did not have this issue; we received a huge dump every hour. It took 5 minutes to shovel it into two raw (partitioned on hour) tables, and another minute to do 8 aggregations into summary tables. Using MEMORY helped a lot in the performance. And I found a way to avoid unnecessarily replicating the MEMORY tables to slaves.)

Options: ReplyQuote


Subject
Views
Written By
Posted
2730
January 07, 2012 06:55AM
Re: MySQL performance help
1007
January 08, 2012 12:04PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.