Re: Modify primary key for partitioning
>(I'm not sure the PK is used if ID_MSG is the first column in the index)
I tried the opposite order, it uses the index for the received_date but (as expected) not for ID_MSG (when received_date is not in the and I got some UPDATE on this column which took ages. So I decided to let the index like that (`ID_MSG`,`RECEIVED_DATE`)
>`ID` UNSIGNED INT NOT NULL is more appropriate, as it would help reduce length of secondary keys
Actually this column is made up of characters and integers.
>How much RAM do you have on the server? assuming at least 8GB or more.
I currently use 2GB for the server (dedicated for the database), and around 1GB for the database.
It's not much I don't think the RAM is currently a problem. The main problem are (I believe) the indexes. Some tables have up to 24 indexes ! and some others don't have any. But this is another problem. If needed in the near future, I can easily get more RAM from the system engineers.
>Would you do this, for two reasons:
SELECT COUNT(*), AVG(LENGTH(XMLCERTIFICATE)), AVG(LENGTH(XMLPOSTIT))
It took around 4500s to complete (on the non-partitioned table, MySQL5.0).
The result is :
COUNT(*) : 231357
AVG(LENGTH(XMLCERTIFICATE)) : 320578
AVG(LENGTH(XMLPOSTIT)) : 2511
It took 2500s to complete on the partition table (compressed) with MariaDB 5.5.
This amount of time is quite huge, but most of the queries on this table are selective, so I should be able to avoid the full table scan.
>Are your received_date BETWEENs typically on the order of 6 months? Or much less? If much less, then more, shorter, partitions may be better.
Generally speaking, the queries are based on the past 3 months. In my exemple table, I chose 6 months for each partition, but this is for my "test" database. I have to think more about the best partition size depending on the data growth.
>You are using 'COMPRESSED' row format with the table
I don't use compression on the production database. The table's size is currently 80GB. After compressing (key block size of 16ko), the size dicrease to 10GB and the performance seems to be good (I haven't made a thorough benchmark yet). I'm really satisfied of compression so far.
>How do you load data into the table? single insert or batch insert e.g. load data infile or insert into values.
Most of the scripts use "load data infile".
Here is the show table status :
*************************** 1. row ***************************
Create_options: row_format=COMPRESSED key_block_size=16 partitioned
In my case partition is useful for:
- I will get many historic data
- Split one big file (currently 80GB but it's gonna grow quickly) into many smaller files. (physically in the server)
- Prevent from a too long backup strategy. Currently a full system copy is done. I'm thinkink of synchronising the data files (database <=> last full backup), and so copy only the files that have been modified since the last backup (1 full + many incrementals betwenn full backup). Most of the time only the most recent partitions will be copied.