MySQL Forums
Forum List  »  Partitioning

Re: Huge Dataset (perhaps?) made for Range Partitioning (Long)
Posted by: JinNo Kim
Date: September 22, 2005 04:16PM

Mikael Ronstrom wrote:
> Hi,
> It certainly sounds like an application suitable
> for partitioning.
> I presume you could do something like this
> LIST(DAYOFYEAR(C)+(YEAR(C)-2005)*366)
> Assuming that you want to start with the 121st day
> of the year.
> Then each new day you add a new partition with
> IN (121+day_number));
> Naturally range partitioning could be used as well
> and one can choose whether one wants
> daily partitions, weekly partitions or any other
> distance between them.
> The current 5.1 version has this syntax but the
> optimised version of this is still a few weeks
> away so don't try it on the really large tables
> yet.
> From looking at the table I get a few questions:
> 1) How important is the primary key. To be able to
> ensure primary key constraints it is
> necessary that the field used in the partition
> function is part of the primary key. Otherwise
> one essentially needs to check in each partition
> for uniqueness which gets a bit buurdensome
> when you have hundreds of partitions.
> So could the primary key be (A, C) instead
> assuming that C is used for the partition
> function.
> Rgrds Mikael

Thanks for the response, I'm hoping we'll be able to provide some
useful feedback to the community and a lab to test some aspects of
partitioning with a large dataset.

The primary key was originally only going to used to uniquely identify
rows in the 1 giant table for quick reference when we had identified a
transaction of interest thorugh other queries. It's no longer unique
across the daily MyISAM tables and I've been considering dropping it
to reclaim the 8 bytes/row.

One of the members of my team spent part of yesterday loading
another smaller server. We intend to attempt a build of 5.1 on it so
we'll be able to test any of the "what if scenarios" with a subset of the
existing tables. The new server only has (2) 146G disks, so we
striped them, giving about 200G for the DB which should allow us to
get a reasonably large dataset to play with.

I will drop 'A' on the tables created on the test server. It seemed
like a good idea at the time, but we are no longer able to use it and
it may impede progress. Would it be better to wait a bit before
starting or is the partitioning code ready to start experimenting
with say 500 million rows or so?

Thanks again,


Options: ReplyQuote

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.