
Re: 500 mil. rows table partitioning
Posted by: Misha Krassovski
Date: July 15, 2011 12:42PM

First, here is what I did according to your recommendations:

>Change that to be about 20% of available RAM. It's used for caching indexes -- you need this caching a lot.
I set it to exactly 20% of what I have (8GB): 1.6GB
SHOW VARIABLES LIKE '%buffer%'; -- cache size
*************************** 1. row ***************************
Variable_name: bulk_insert_buffer_size
        Value: 8388608
*************************** 2. row ***************************
Variable_name: innodb_buffer_pool_size
        Value: 8388608
*************************** 3. row ***************************
Variable_name: innodb_log_buffer_size
        Value: 1048576
*************************** 4. row ***************************
Variable_name: join_buffer_size
        Value: 131072
*************************** 5. row ***************************
Variable_name: key_buffer_size
        Value: 1677721600
*************************** 6. row ***************************
Variable_name: myisam_sort_buffer_size
        Value: 8388608
*************************** 7. row ***************************
Variable_name: net_buffer_length
        Value: 16384
*************************** 8. row ***************************
Variable_name: preload_buffer_size
        Value: 32768
*************************** 9. row ***************************
Variable_name: read_buffer_size
        Value: 131072
*************************** 10. row ***************************
Variable_name: read_rnd_buffer_size
        Value: 262144
*************************** 11. row ***************************
Variable_name: sort_buffer_size
        Value: 2097144
*************************** 12. row ***************************
Variable_name: sql_buffer_result
        Value: OFF
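
(For the record, this is roughly how I applied it; the exact my.cnf location depends on the installation:)

SET GLOBAL key_buffer_size = 1677721600;  -- 1.6GB, takes effect immediately
-- and in my.cnf under [mysqld], so it survives a restart:
-- key_buffer_size = 1600M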

>See how the refs stop at offsetid and fail to include `flag`? I'm guessing it is throwing up its hands when trying to compare the number 0 to a VARCHAR.
>Which should it be? always numeric? Or always strings? Either change the VARCHAR, or add quotes around the '0'.
Clear mistake! I don't know how I missed that. Thank you very much for pointing that out! The flag can only hold a numeric value from 1 to 9, so I changed it to TINYINT(1).
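
Roughly, the change was (assuming the column was a plain VARCHAR before, as discussed):

ALTER TABLE L2data MODIFY flag TINYINT(1) NOT NULL;  -- one byte, and comparing against the number 0 now works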

>What kind of values go into the only CHAR (flag)? Perhaps utf8 is overkill?
Following the previous correction, I changed it to latin1.
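
Something like this (the table default now shows as latin1 in SHOW CREATE TABLE below):

ALTER TABLE L2data CONVERT TO CHARACTER SET latin1;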

Here is what I have now:
SHOW CREATE TABLE L2data\G -- engine, indexes;
*************************** 1. row ***************************
       Table: L2data
Create Table: CREATE TABLE `L2data` (
  `extensionid` smallint(6) NOT NULL,
  `flag` tinyint(1) NOT NULL,
  `offsetid` int(11) NOT NULL,
  `repeat` tinyint(4) NOT NULL,
  `siteid` int(11) NOT NULL,
  `time` decimal(10,6) NOT NULL,
  `value` float NOT NULL,
  `variableid` smallint(6) NOT NULL,
  KEY `indx` (`time`,`siteid`,`variableid`,`repeat`,`extensionid`,`offsetid`,`flag`,`value`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

SHOW TABLE STATUS LIKE 'L2data'\G -- sizes;
*************************** 1. row ***************************
           Name: L2data
         Engine: MyISAM
        Version: 10
     Row_format: Fixed
           Rows: 414169342
 Avg_row_length: 24
    Data_length: 9940064208
Max_data_length: 6755399441055743
   Index_length: 13761551360
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2011-07-14 15:13:00
    Update_time: 2011-07-14 15:17:53
     Check_time: 2011-07-14 21:06:55
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.00 sec)

EXPLAIN SELECT 'US-IB2' AS site, t0.time,
       t0.value AS APAR_0, t1.value AS APAR_fill_0,
       t2.value AS APARpct_0, t3.value AS APARpct_fill_0,
       t4.value AS FH2O_0, t5.value AS FH2O_fill_0
FROM L2data t0
INNER JOIN L2data t1 ON t0.time = t1.time
INNER JOIN L2data t2 ON t0.time = t2.time
INNER JOIN L2data t3 ON t0.time = t3.time
INNER JOIN L2data t4 ON t0.time = t4.time
INNER JOIN L2data t5 ON t0.time = t5.time
WHERE t0.siteid = 47 AND t0.variableid = 1 AND t0.repeat = 0 AND t0.extensionid = 3 AND t0.offsetid = 3 AND t0.flag = 0
  AND t1.siteid = 47 AND t1.variableid = 1 AND t1.repeat = 0 AND t1.extensionid = 1 AND t1.offsetid = 3 AND t1.flag = 0
  AND t2.siteid = 47 AND t2.variableid = 6 AND t2.repeat = 0 AND t2.extensionid = 3 AND t2.offsetid = 3 AND t2.flag = 0
  AND t3.siteid = 47 AND t3.variableid = 6 AND t3.repeat = 0 AND t3.extensionid = 1 AND t3.offsetid = 3 AND t3.flag = 0
  AND t4.siteid = 47 AND t4.variableid = 5 AND t4.repeat = 0 AND t4.extensionid = 3 AND t4.offsetid = 3 AND t4.flag = 0
  AND t5.siteid = 47 AND t5.variableid = 5 AND t5.repeat = 0 AND t5.extensionid = 1 AND t5.offsetid = 3 AND t5.flag = 0
ORDER BY t0.time;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t0
         type: index
possible_keys: indx
          key: indx
      key_len: 23
          ref: NULL
         rows: 1
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: ref
possible_keys: indx
          key: indx
      key_len: 19
          ref: Global.t0.time,const,const,const,const,const,const
         rows: 591679
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: ref
possible_keys: indx
          key: indx
      key_len: 19
          ref: Global.t0.time,const,const,const,const,const,const
         rows: 591679
        Extra: Using index
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
         type: ref
possible_keys: indx
          key: indx
      key_len: 19
          ref: Global.t2.time,const,const,const,const,const,const
         rows: 591679
        Extra: Using where; Using index
*************************** 5. row ***************************
           id: 1
  select_type: SIMPLE
        table: t4
         type: ref
possible_keys: indx
          key: indx
      key_len: 19
          ref: Global.t1.time,const,const,const,const,const,const
         rows: 591679
        Extra: Using where; Using index
*************************** 6. row ***************************
           id: 1
  select_type: SIMPLE
        table: t5
         type: ref
possible_keys: indx
          key: indx
      key_len: 19
          ref: Global.t0.time,const,const,const,const,const,const
         rows: 591679
        Extra: Using index
6 rows in set (0.00 sec)

Unfortunately, the query above takes about 12 minutes to run, and it isn't the most complicated one: a query may involve up to 25-30 variables plus a 'time from/to' condition.
My idea about partitioning was this: since 99% of my queries extract data for only one site, if I partition the table by siteid, only the queried partition gets involved and there is no need to search through the whole table. I could achieve basically the same effect by creating a separate table for each site, but I have 140 sites and will ultimately deal with 3 types of data, which means 420 tables. Isn't that too many? I don't know whether it would give any performance gain, considering all the overhead of dealing with so many tables.
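
Roughly what I have in mind (just a sketch, one partition per site; I haven't run this yet):

ALTER TABLE L2data
  PARTITION BY KEY (siteid)
  PARTITIONS 140;

-- a single-site query should then touch only one partition:
EXPLAIN PARTITIONS SELECT * FROM L2data WHERE siteid = 47;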
Thank you very much for helping me!

Misha
