MySQL :: Re: conflict resolution

Re: conflict resolution

Posted by: Rick James
Date: June 30, 2012 09:40AM

Some definitions (for those listening in). (I hope I have them correct.)

In MySQL's "asynchronous" Replication (non-NDB), a write is committed on the Master before worrying about the Slave. When the write eventually arrives at the Slave, one hopes it can be applied. If the Slave is out of sync with the Master, the write could cause a "conflict" (such as "duplicate key" on a PRIMARY or UNIQUE index). In such a conflict, the Slave's replication (SQL) thread simply stops (hangs), waiting for human assistance. Note: None of the statements of an InnoDB transaction are sent to the Slaves until the COMMIT; at that point the entire transaction is sent. (Note: this is why the binlog needs a buffer for the InnoDB statements, and why any MyISAM statements in the middle of the BEGIN..COMMIT arrive at, and are executed on, the Slaves before any of the InnoDB statements.)
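For those who prefer to see it in code, here is a minimal sketch of that failure as a toy Python model. This is not MySQL code: the `Master` and `Slave` classes, the relay-log list, and the `sql_thread_running` flag are all made up for illustration. Only the behaviour (commit on the Master first, apply later on the Slave, stop at the first duplicate key) comes from the description above.

```python
class DuplicateKey(Exception):
    """Raised when an INSERT hits an existing PRIMARY KEY (toy model)."""


class Server:
    def __init__(self):
        self.rows = {}  # primary key -> value

    def insert(self, pk, value):
        if pk in self.rows:
            raise DuplicateKey(pk)
        self.rows[pk] = value


class Master(Server):
    def __init__(self):
        super().__init__()
        self.binlog = []  # committed writes, in commit order

    def commit_insert(self, pk, value):
        # The write is committed here, with no regard for any Slave.
        self.insert(pk, value)
        self.binlog.append((pk, value))


class Slave(Server):
    def __init__(self):
        super().__init__()
        self.relay_log = []
        self.applied = 0                # number of events applied so far
        self.sql_thread_running = True
        self.last_error = None

    def fetch(self, master):
        # Copy the events not yet received (hypothetical stand-in for the I/O thread).
        self.relay_log.extend(master.binlog[len(self.relay_log):])

    def apply(self):
        # Replay events in order; stop at the first conflict.
        while self.sql_thread_running and self.applied < len(self.relay_log):
            pk, value = self.relay_log[self.applied]
            try:
                self.insert(pk, value)
            except DuplicateKey:
                self.sql_thread_running = False
                self.last_error = f"duplicate key {pk}"
            else:
                self.applied += 1


def demo():
    master, slave = Master(), Slave()
    slave.insert(2, "written directly on the Slave")  # the Slave is now out of sync
    for pk in (1, 2, 3):
        master.commit_insert(pk, f"row {pk}")         # all three succeed on the Master
    slave.fetch(master)
    slave.apply()
    return master, slave
```

After `demo()`, the Master holds rows 1, 2 and 3, while the Slave has applied only row 1 and stopped. Row 3 sits unapplied in the relay log until a human sorts out the conflict on row 2.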

Newer MySQL versions (5.5 and later) have an optional "semi-sync" mode. With it, the Master does not finish the COMMIT (and hence does not reply to the client) until at least one Slave's "relay log" has received the write. It does not wait for that Slave (or any other) to attempt the write, so a duplicate key (and the resulting hang) is still possible. What semi-sync gives you is confidence that the transaction is at least copied to another machine. (However, see my comment about floods, below.)
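A rough sketch of where semi-sync draws the line, again as a toy Python model rather than anything from MySQL itself. The dict layout and the function names `semi_sync_commit` and `slave_apply` are invented for illustration. The point is only the ordering: the client gets its reply once the event is in the Slave's relay log, and the Slave executes it some time later.

```python
def semi_sync_commit(master, slave, pk, value):
    """Toy semi-sync COMMIT: reply to the client only once the Slave has
    the event in its relay log. The Slave has not executed it yet."""
    master["rows"][pk] = value
    slave["relay_log"].append((pk, value))  # copy to, and acknowledgement from, the Slave
    return "OK"                             # the client sees success here


def slave_apply(slave):
    """Replay the relay log in order; stop at the first duplicate key."""
    while slave["pos"] < len(slave["relay_log"]):
        pk, value = slave["relay_log"][slave["pos"]]
        if pk in slave["rows"]:
            return f"stopped: duplicate key {pk}"
        slave["rows"][pk] = value
        slave["pos"] += 1
    return "caught up"


master = {"rows": {}}
slave = {"rows": {7: "stray row"}, "relay_log": [], "pos": 0}  # Slave already out of sync

reply = semi_sync_commit(master, slave, 7, "from the Master")
status = slave_apply(slave)
```

Here `reply` is "OK" and the event is safely on a second machine, yet `status` still reports a stopped Slave. Semi-sync protects the copy, not the apply.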

Clustrix's nodes are tightly coupled, so a write to any node is (I think) committed on all nodes before returning from the COMMIT to the client.

Percona's Cluster is almost tightly coupled. Normally transactions will COMMIT, not ROLLBACK. (If this is not the case for your app, Percona may not perform well??) You start a transaction on any node; some of the activity is sent in parallel to the other nodes, letting them get started sooner than would happen with regular Replication. At the time of the COMMIT/ROLLBACK, there is some extra handshaking to finish syncing the rest of the transaction. This can lead to the COMMIT failing on the Master, even though all the statements succeeded.
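To show how a COMMIT can fail when every statement succeeded, here is a generic "first committer wins" check in Python. It is a sketch of the general optimistic idea only; I am not claiming this is how Percona implements it. `Cluster`, `Transaction` and `CertificationFailure` are hypothetical names.

```python
class CertificationFailure(Exception):
    """COMMIT rejected because a conflicting write was committed first."""


class Cluster:
    def __init__(self):
        self.committed = []  # write sets (frozensets of keys) in global commit order

    def begin(self):
        return Transaction(self, start=len(self.committed))


class Transaction:
    def __init__(self, cluster, start):
        self.cluster = cluster
        self.start = start   # number of commits already visible when we began
        self.keys = set()

    def write(self, key):
        # Each statement succeeds locally; nothing is checked against other nodes yet.
        self.keys.add(key)

    def commit(self):
        # Compare our write set against everything committed since we began.
        for write_set in self.cluster.committed[self.start:]:
            if write_set & self.keys:
                raise CertificationFailure(sorted(write_set & self.keys))
        self.cluster.committed.append(frozenset(self.keys))


cluster = Cluster()
t1 = cluster.begin()   # imagine this one running on node A
t2 = cluster.begin()   # and this one on node B
t1.write("row 42")
t2.write("row 42")     # both statements succeed
t1.commit()            # first committer wins
try:
    t2.commit()
    outcome = "committed"
except CertificationFailure:
    outcome = "rolled back at COMMIT"
```

The second transaction never saw an error until COMMIT, so the application has to be prepared to retry the whole transaction at that point.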

"Conflict resolution". In NDB Cluster, you set up rules for what to do if there is a conflict. You can write to any node. If you, say, INSERT the same unique key simultaneously on two nodes, they go through (commit), and both clients goes away happy. Eventually (very soon), the nodes notice the conflict and apply the rule you chose, thereby deciding which copy of the record to keep. Example rule: 'the one with the older timestamp wins'.

A big issue in all this... It takes time for nodes to talk to each other. Clustrix avoids these delays by requiring the nodes to be physically adjacent. NDB avoids it by saying that the data will be "eventually" consistent.

But if you are using Replication for "BCP" (business continuity planning), you really need a consistent Slave (or, better yet, a warm Master) fully synced up, and remotely located so that a flood/earthquake/tornado/etc will not destroy (or make unavailable) all copies of your data simultaneously. The network delay is over 100ms for a really remote site, or possibly 10ms for a nearby, but safely distant, site. In a high-performance setup, even 10ms added to every write is totally unacceptable.

So, there is an "unsolvable" conflict between performance and a 100% synced backup ready for failover.
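Some back-of-the-envelope arithmetic on why the added delay hurts, as a small Python sketch. The 10ms and 100ms figures are the ones mentioned above; the 1ms local commit time is purely an assumption for illustration, not a measurement.

```python
def max_serial_commits_per_second(local_commit_ms, added_delay_ms):
    """Upper bound on back-to-back commits from ONE connection when every
    COMMIT must also wait added_delay_ms for a remote site."""
    return 1000.0 / (local_commit_ms + added_delay_ms)


LOCAL_COMMIT_MS = 1.0  # assumed, not measured

for label, delay_ms in [("no remote wait", 0.0),
                        ("nearby site", 10.0),
                        ("remote site", 100.0)]:
    rate = max_serial_commits_per_second(LOCAL_COMMIT_MS, delay_ms)
    print(f"{label:15s} {delay_ms:6.1f} ms -> {rate:7.1f} commits/sec per connection")
```

Under that assumption a single connection drops from about 1000 commits per second to about 91 with a nearby site, and to about 10 with a remote one. Many concurrent connections can overlap their waits, so total throughput falls less steeply, but each individual write still pays the full delay.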

Another thing to note about all these designs: every write is performed on every machine, so "write scaling" is essentially impossible. (Sharding is the solution.)
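A small Python sketch of the difference, assuming the simplest possible sharding scheme (hash the key, take it modulo the number of shards). The function names and the choice of crc32 are mine, picked only because crc32 gives the same answer on every run and every machine.

```python
import zlib
from collections import Counter


def shard_for(key, shard_count):
    """Pick a shard from a stable hash of the key (crc32 is an arbitrary choice)."""
    return zlib.crc32(str(key).encode("utf-8")) % shard_count


def writes_per_machine_replicated(keys, machine_count):
    # Replication: every machine performs every write.
    return [len(keys)] * machine_count


def writes_per_machine_sharded(keys, shard_count):
    # Sharding: each write lands on exactly one shard.
    counts = Counter(shard_for(k, shard_count) for k in keys)
    return [counts[s] for s in range(shard_count)]


keys = [f"user:{i}" for i in range(1000)]
replicated = writes_per_machine_replicated(keys, 4)
sharded = writes_per_machine_sharded(keys, 4)
```

With 1000 writes and 4 machines, each replicated machine performs all 1000 writes, while the shard counts add up to 1000 in total. Adding replicated machines adds read capacity; only splitting the data reduces the writes each machine has to do.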
