MySQL :: Re: How to handle secondary keys in partitioning?

New Topic

Re: How to handle secondary keys in partitioning?

Posted by: Mattias Jonsson
Date: May 05, 2010 02:20AM

OK in this case it depends on how many partitions and how many matching rows there are for 'SELECT * FROM tp WHERE sender_id=1 ORDER BY date'.

Since it is partitioned by recipient_id pruning will not kick in.

Lets say you have N partitions and M matching rows (sender_id = 1).

This means that it has to do a index ordered read from each partition, and merge sort all N rows. And while there is still matching rows a new index ordered read would be done, until all matching rows are found.

So if (M < N) i.e. more partitions than matching rows, it will need to do more index lookups (and the sum of all steps in the b-tree index will be higher).

But if (M > N) i.e. more matching rows than partitions, the number of index reads will be the same (but the sum of the b-tree depth will be slightly higher, and there will be an additional sorting cost).

So there is an overhead of index search in partitioning, since the indexes are partitioned just like the data, but this overhead is higher per row if there are more partitions than matching rows and shrinks per row as the matching row number increases.

And the cost for sorting is also added in case of partitioning.

The type of performance degradation is hard to estimate without benchmarking, it depends on how many partitions and how many matching rows are found, as well as used engine and if a covering index can be used etc.

What I would suggest is that you benchmark your load on both partitioned and non partitioned tables. And if possible, add the results to this topic!

(Note that this is only index reads we discuss, the cost for index writes should be lower for partitioned tables, since the b-tree depth is lower per partition than it would be for a non partitioned table.)

If you usually query recipient_id and sender_id independently, I would suggest you to try to also subpartition your data, perhaps like:
PARTITION BY LIST (recipient_id % 5)
SUBPARTITION BY HASH (sender_id) PARTITIONS 5
(PARTITION pREC_0 VALUES IN (0),
PARTITION pREC_1 VALUES IN (1),
PARTITION pREC_2 VALUES IN (2),
PARTITION pREC_3 VALUES IN (3),
PARTITION pREC_4 VALUES IN (4))

That way you have partitioned on both recipient_id and sender_id equally, and have a total of 25 partitions. (you could also use RANGE instead of LIST, but I used LIST to get an easy way to equally distribute both columns.)

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

How to handle secondary keys in partitioning?

4150

John Smith

May 04, 2010 12:46AM

Re: How to handle secondary keys in partitioning?

2346

Mattias Jonsson

May 04, 2010 01:29PM

Re: How to handle secondary keys in partitioning?

1984

John Smith

May 04, 2010 04:30PM

Re: How to handle secondary keys in partitioning?

2185

Mattias Jonsson

May 05, 2010 02:20AM

Re: How to handle secondary keys in partitioning?

2047

Rick James

May 06, 2010 12:04AM

Re: How to handle secondary keys in partitioning?

2443

Mattias Jonsson

May 06, 2010 03:23PM

Re: How to handle secondary keys in partitioning?

2088

John Smith

May 06, 2010 10:50PM

Re: How to handle secondary keys in partitioning?

2061

Mattias Jonsson

May 07, 2010 02:38AM

Re: How to handle secondary keys in partitioning?

2019

John Smith

May 08, 2010 05:31PM

Re: How to handle secondary keys in partitioning?

2076

Mattias Jonsson

May 09, 2010 04:37AM

Re: How to handle secondary keys in partitioning?

1955

John Smith

May 09, 2010 12:41PM

Re: How to handle secondary keys in partitioning?

1925

Mattias Jonsson

May 09, 2010 01:23PM

Re: How to handle secondary keys in partitioning?

2027

Mattias Jonsson

May 12, 2010 05:29AM

Re: How to handle secondary keys in partitioning?

2481

John Smith

May 06, 2010 12:34AM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.