MySQL :: Re: partitioning by highly skewed data

New Topic

Re: partitioning by highly skewed data

Posted by: Mikael Ronström
Date: June 18, 2005 02:16AM

Hi,

Hiroshi Takenaka wrote:
> Hi,
>
> Thank you for the response.
>
> Mikael Ronstrom wrote:
> >Subpartitioning is possible for RANGE and LIST
> partitions, the
> >subpartitioning is then HASH partitioning.
>
> It would be very nice if we can subpartition HASH
> partitions,
> or do somthing like:
>
> : create table t1 (high_skew_col int,
> some_low_skew_col int,...)
> : partitioned by
> range(hashfunc(high_skew_col) mod n) ...
> : subpartitioned by
> hash(hashfunc(some_low_skew_col))...;
>
> and distribute subpartitions among several disks.
> Perhaps, it will give us maximum performance, if
> future version of
> MySQL can do parallel scanning of partitions ;-)
>

Yes, exactly this will be possibly already in the first alpha version.
Yes, parallel scanning of partitions is planned for in a not too distant
future. Also many other parallel parts are planned for although we will
focus first on making partitions easy to maintain.

> In this example, I expect that we can still enjoy
> the benefit of
> (hash)partitioning on queries like 'select * from
> t1 where high_skew_col = X',
> while we can distribute data more evenly among
> disks.
>

Yes, the partitioned handler will be able to detect that only a subset of the partitions need
to be checked (in the first version not in every situation but this will improve over time as
the optimiser part is further developed)

> RANGE partitioning by high_skew_col +
> subpartitioning is not very
> suitable to our situation, because our
> high_skew_col is actually
> sparse and unevenly distributed id numbers.
>

If they are not so many you could also contemplate using LIST partitioning.

> When I wrote my previous post, my idea was to
> define
> a custom hash function like:
>
> : fn(val1, val2) mod n = hash(val2) mod k
> (if val1 is null)
> : = hash(val1)
> mod (n -k) + k (if val1 is not null)
>
> and use 'partitioned by hash(high_skew_col,
> some_low_skew_col)'.
>

PARTITION BY KEY (low_skew_col, high_skew_col)
would achieve more or less this purpose (uses a MySQL
defined hash function that takes NULL values into account.

> But, of course, it's not very useful if we rarely
> use some_low_skew_col
> in where clause. I must have been sleeping...
>

Thx for the input, I added support for NULL values in partition functions yesterday.

Rgrds Mikael

> Hiroshi

Mikael Ronstrom
Senior Software Architect, MySQL AB
My blog: http://mikaelronstrom.blogspot.com

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

partitioning by highly skewed data

4583

Hiroshi Takenaka

June 16, 2005 07:58AM

Re: partitioning by highly skewed data

3018

Mikael Ronström

June 16, 2005 11:57AM

Re: partitioning by highly skewed data

3112

Hiroshi Takenaka

June 17, 2005 10:45AM

Re: partitioning by highly skewed data

3030

Mikael Ronström

June 18, 2005 02:16AM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.