
Re: 500 mil. rows table partitioning
Posted by: Rick James
Date: July 01, 2011 06:22PM

Wrong questions. The answer is "a key-value schema is really bad for performance!".

OK, let's see what we can do...

The pattern seems to be:
INNER JOIN L2data_part t4 ON t0.time = t4.time
      AND  t4.siteid = 47
      AND  t4.variableid = 5
      AND  t4.repeat = 0
      AND  t4.extensionid = 3
      AND  t4.offsetid = 3
      AND  t4.flag = 0
INDEX(time, siteid, variableid, repeat, extensionid, offsetid, flag)
is the first guess. But, since you only want the `value`, then
INDEX(time, siteid, variableid, repeat, extensionid, offsetid, flag, value)
would be better -- then it could do the lookup entirely in the index. (The order of the fields is not critical, except that `value` must be last).
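As a sketch, using the table and column names from your query (adjust to your actual schema; note that `repeat` is a MySQL reserved word, so it needs backticks in DDL):

```sql
-- Covering index: both the lookup and the fetch of `value`
-- are satisfied entirely from the index, no row lookup needed.
ALTER TABLE L2data_part
  ADD INDEX cover_ix (time, siteid, variableid, `repeat`,
                      extensionid, offsetid, flag, value);
```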

Please provide
* SHOW CREATE TABLE tbl\G -- engine, indexes (DESCRIBE is lousy)
* SHOW TABLE STATUS LIKE 'tbl'\G -- sizes
* EXPLAIN SELECT ...\G -- clues of inefficiencies
* SHOW VARIABLES LIKE '%buffer%'; -- cache size
and surround them with [ code ] and [ / code ]
How much RAM do you have?

> time | decimal(10,6)
How is time represented??

Back to slamming key-value. You say
> variableids can be up to 83
Does that mean that the data could be a single table with about 83 columns? Looks like it would have `time` as the PRIMARY KEY. Is there some reason not to do that? It would be soooo much faster.
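Something like this sketch -- the column names here are made up (only APAR_0 and APARpct_0 come from your post); you'd have one column per variableid, about 83 in all:

```sql
-- Sketch only: one wide row per timestamp instead of ~83 key-value rows.
CREATE TABLE L2data_wide (
  time      DECIMAL(10,6) NOT NULL,
  APAR_0    FLOAT,
  APARpct_0 FLOAT,
  -- ... one column per variable, ~83 in total ...
  PRIMARY KEY (time)
) ENGINE=InnoDB;
```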

> APAR_0 ... APARpct_0
Could one of those be derived from the other?

Back to indexing the existing table. It might be better to use InnoDB, and have
PRIMARY KEY(time, siteid, variableid, repeat, extensionid, offsetid, flag)
(without `value`). Normally, one wants a PRIMARY KEY to be the minimal set of fields that uniquely identify the row. Are all 7 of those fields needed for such?
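A single ALTER can do both the engine change and the key change in one rebuild. Sketch only -- this assumes the table has no existing PRIMARY KEY (if it does, add `DROP PRIMARY KEY,` first):

```sql
-- One pass: convert to InnoDB and make these 7 columns the clustered key.
-- Expect this to take a long time and extra disk on a 500M-row table.
ALTER TABLE L2data_part
  ENGINE = InnoDB,
  ADD PRIMARY KEY (time, siteid, variableid, `repeat`,
                   extensionid, offsetid, flag);
```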

> 500 mil. records and will grow
Have you ever tried to add an index to a table that big? Plan on an hour. And do it now, not later.

How much disk do you have? How big is the table now? Allow 2x-4x the disk space after adding an index and/or changing to InnoDB.

Converting to a single table with 83+ columns will shrink the data significantly because you can get rid of variableid, etc. And you can use TINYINT UNSIGNED (1 byte) instead of FLOAT (4 bytes) for some of the `value`s.

Will you be dropping "old" data? That is the main use for PARTITION. Anyway, let's finish discussing the schema before discussing PARTITIONing.
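For when we get there, here is the shape of it. Sketch only -- the actual range boundaries depend on how your decimal(10,6) `time` is encoded (which you haven't told us yet), and MySQL requires the partition expression to use the PRIMARY KEY column(s), which `time` satisfies here:

```sql
-- Sketch: RANGE partitioning on the integer part of `time`.
ALTER TABLE L2data_wide
  PARTITION BY RANGE (FLOOR(time)) (
    PARTITION p0   VALUES LESS THAN (1000),
    PARTITION p1   VALUES LESS THAN (2000),
    PARTITION pmax VALUES LESS THAN MAXVALUE
  );

-- Purging old data is then a near-instant metadata operation,
-- not a huge DELETE:
ALTER TABLE L2data_wide DROP PARTITION p0;
```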
