As partitioning is soon to arrive in MySQL 5.1 for source downloads I'll start a little explanation
of what partitioning is all about and what potential uses it can have.
I'll start out with Range Partitioning:
I am eager to hear opinions on what are interesting features of Partitioning also so please post
your opinions. Now is a good time to do so since development is ongoing and is not set in stone.
Before explaining Range Partitioning it is necessary to set down some basic definitions first.
Definition of Partitioning in MySQL:
A partitioned table is an abstract table that implements a table by making use of one stored table
for each partition in the table. Theoretically each partition could use any storage engine. In the first
release all partitions must use the same storage engine.
Definition of Partition Function:
A partition function is a non-constant and non-random function of one or more fields in the table.
It cannot contain a query but can contain any scalar expression. In early versions of partitioning
the function will need to return an integer result. Best results are achieved if only one field is used
in the partition function and also the partition function should be relatively simple since it will be
evaluated very often in queries.
Definition of Range Partitioning:
Range Partitioning is used to define a partitioned table where each partition contains values in a
range. Ranges are non-overlapping. The only hole allowed is in values greater than the upper
range of the last partition.
Common usage of Range Partitioning:
Range Partitioning is a natural partitioning to use where time is involved in the partition function.
As an example a table might be partitioned on the year.
Partition 0: Before year 1990
Partition 1: 1991 - 1994
Partition 2: 1995 - 1998
Partition 3: 1999 - 2002
Partition 4: 2003 - 2005
So here we have 5 partitions. To create a table with such a partitioning scheme the following SQL
syntax could be used:
CREATE TABLE t1 (year int, int some_data)
PARTITION BY RANGE (year)
(PARTITION p0 VALUES LESS THAN (1991),
PARTITION p1 VALUES LESS THAN (1995),
PARTITION p2 VALUES LESS THAN (1999),
PARTITION p3 VALUES LESS THAN (2002),
PARTITION p4 VALUES LESS THAN (2006));
Now INSERT INTO t1 VALUES (1997, 1234) would place the record into partition p2.
Benefits:
The benefits of Range Partitioning is actually a couple of them:
1) Really large tables can be spread over several disks to ensure that bigger throughput on
queries on the table can be achieved
2) When data becomes old in this scheme, it can easily be swapped out by dropping the first
partition in the table. A simple ALTER TABLE DROP PARTITION p0 will dismiss of all data
generated before 1991.
3) Certain queries can be much more efficient such as:
SELECT SUM(some_data) FROM t1 WHERE year = 2002;
For this query the MySQL Server can discover that only partition p4 needs to be scanned,
partition p0, p1, p2, p3 can never contain any records satisfying the WHERE clause.
4) Partitioning in general lays a foundation for Parallel Query Support. As an example it is
very simple to parallelise the query:
SELECT SUM(some_data) WHERE some_data > 100;
This query can be executed in parallel on each partition and the total sum is the sum of the
query executed against each individual partition.
Mikael Ronstrom
Senior Software Architect, MySQL AB
My blog:
http://mikaelronstrom.blogspot.com
Edited 1 time(s). Last edit at 07/21/2005 09:45AM by Edwin DeSouza.