Hi,
Michael Moores wrote:
> My question regards the scalability of
> update/insert operations across
> a cluster. My background is with Oracle, and I
> know that with Oracle RAC you can't scale much
> beyond 4-5 nodes because of the data
> synchronization overhead between nodes.
> Maintaining read consistency across nodes can be
> very expensive when the data is dirtied by one
> node and then
> queried again by another node.
>
> If I intend to scale Oracle RAC beyond a few
> nodes, we would partition the requests based on a
> partition key, instead of spraying requests
> randomly across nodes. So we would essentially
> force each node to cache a known partition of data
> in memory.
> This results in inserts/updates and queries being
> directed to a particular pair of nodes (one
> primary and one failover), and this dramatically
> reduces data contention across the network when
> multiple nodes have copies of the same data in
> cache. Oracle asserts that we can scale far
> beyond a few nodes if we take this approach.
>
> Should I consider the same approach with MySQL
> cluster?
> What should I be thinking about here if I want to
> consider MySQL cluster?
>
Yes, MySQL Cluster also benefits greatly from applications accessing known
partitions. Currently partitions are assigned automatically; in 5.1
partitioning can also be user defined. The automatic partitioning applies
an MD5 function to the primary key and then uses some linear hash
logic to pick the partition.
With this kind of separation of the application you get more or less linear
scalability with MySQL Cluster.
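
As a rough illustration of the MD5-plus-linear-hash idea mentioned above, here
is a minimal Python sketch. It is not the actual MySQL Cluster code: the
function name partition_for_key, the use of the first 4 digest bytes, and the
byte order are all assumptions made only for this example.

```python
import hashlib


def partition_for_key(primary_key: bytes, num_partitions: int) -> int:
    """Map a primary key to a partition number (illustrative sketch only)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")

    # Hash the primary key with MD5 and take 32 bits of the digest.
    # Which bits are used, and in what byte order, is an assumption here.
    digest = hashlib.md5(primary_key).digest()
    h = int.from_bytes(digest[:4], "little")

    # Linear hashing: find the smallest power of two >= num_partitions.
    size = 1
    while size < num_partitions:
        size <<= 1

    # Try the larger address space first; if that slot does not exist yet,
    # fall back to the next smaller power of two.
    part = h & (size - 1)
    if part >= num_partitions:
        part = h & ((size >> 1) - 1)
    return part


if __name__ == "__main__":
    # Hypothetical keys; an application could use the result to send each
    # request to the node pair that owns the partition.
    for key in (b"1001", b"1002", b"1003"):
        print(key, partition_for_key(key, 6))
```

The point of the linear-hash step is that the number of partitions does not
have to be a power of two, and the same key always maps to the same partition,
so an application can direct all work for that key to one place.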
Rgrds Mikael
> Thanks,
> --Michael
Mikael Ronstrom
Senior Software Architect, MySQL AB
My blog:
http://mikaelronstrom.blogspot.com