MySQL :: Re: 500 mil. rows table partitioning

New Topic

Re: 500 mil. rows table partitioning

Posted by: Rick James
Date: July 15, 2011 07:59PM

> no need to search through the whole table
But you are currently not going through the whole table.

Are there really about 591679 rows matching
t0.time = t1.time AND t1.siteid = 47 AND t1.variableid = 1 AND t1.repeat = 0 AND t1.extensionid = 1 AND t1.offsetid = 3 AND t1.flag = 0
??
I would expect only 1. If it is only one, then the lookup will be quite efficient. At worst it will be about 1 disk hit.

> 420 tables. Isn't that too many?
No. But PARTITION might be cleaner. A million tables is very bad.

> 99% of my queries extract data for only one site
If you do not do partitioning, then this index has a similar effect (of localizing the disk blocks being accessed):
KEY (`siteid`,`time`, `variableid`,`repeat`,`extensionid`,`offsetid`,`flag`,`value`)
(I moved sited id to the start)

How many rows for the popular sideid? Perhaps about 591679?

I do not understand

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t0
         type: index
possible_keys: indx
          key: indx
      key_len: 23
          ref: NULL
         rows: 1
        Extra: Using where; Using index

It looks like it should scan through all the times for
t0.siteid = 47
AND t0.variableid = 1
AND t0.repeat = 0
AND t0.extensionid = 3
AND t0.offsetid = 3
AND t0.flag = 0
I would expect
key_len and rows to be the same as for the other tables.

I would expect the need for
1. scan 591679 rows for t0;
2. 5*591679 probes (Using index) into t1, etc.
That could take 12 minutes!

Is the output 591679 rows long? If not, please explain. (Maybe some times don't have all the requested values of repeat, extensionid, etc.)

I see repeat and flag being always zero in your example. Is this typical? If so, then rearranging to this might help further localize the data:
KEY (`siteid`,`repeat`,`flag`, `time`, `variableid`,`extensionid`,`offsetid`,`value`)

> key_buffer_size 1.6GB
> Index_length: 13GB
> 99% of my queries extract data for only one site
Putting those together, I would predict that nearly all the time, the desired index blocks will be in the key_buffer. Of course this assumes the index can be constructed to allow such. I feel that the ones we are trying can achieve that. So could PARTITIONing. So, I still have not found a performance reason for PARTITION.

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

500 mil. rows table partitioning

3931

Misha Krassovski

June 27, 2011 09:15AM

Re: 500 mil. rows table partitioning

2314

Rick James

July 01, 2011 06:22PM

Re: 500 mil. rows table partitioning

2160

Misha Krassovski

July 13, 2011 02:43PM

Re: 500 mil. rows table partitioning

1978

Rick James

July 14, 2011 12:51AM

Re: 500 mil. rows table partitioning

2105

Misha Krassovski

July 15, 2011 12:42PM

Re: 500 mil. rows table partitioning

2013

Rick James

July 15, 2011 07:59PM

Re: 500 mil. rows table partitioning

1972

Misha Krassovski

July 21, 2011 09:40AM

Re: 500 mil. rows table partitioning

2086

Rick James

July 21, 2011 08:46PM

Re: 500 mil. rows table partitioning

2197

Misha Krassovski

July 27, 2011 06:27AM

Re: 500 mil. rows table partitioning

1930

Rick James

July 30, 2011 06:14PM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.