Transaction inconsistency in Group Replication with AFTER mode
Posted by: Ankur Shukla
Date: November 16, 2023 03:54AM
Hi,
Can someone explain the following behavior? I expected that, since Paxos is used, such a data inconsistency could not occur. However, I can reproduce this behavior consistently.
I have a 3-node MySQL Group Replication cluster with AFTER consistency.
When a node is abruptly shut down or disconnected from the cluster, ongoing transactions on the primary are rolled back due to certification failure. However, those transactions are applied on the secondary nodes, even though no writes are done on the secondaries. This leads to a data inconsistency: a client connected to the primary sees its connection terminated (if the primary goes down) and assumes the transaction failed, but the data is actually present in the cluster because a secondary node applied the transaction.
I verified this for the following scenarios:
1. one of the secondary nodes is shut down abruptly
2. the primary is shut down abruptly
In both cases the secondary nodes had more transactions than the primary. I verified this using gtid_executed, e.g.:
Old: efa6f74a-73f0-11ee-8925-0a67e5184ce8:1,
fa4e6db0-3475-46c9-8e9c-2fab646ed636:1-1584925:2492207-2511313
New: efa6f74a-73f0-11ee-8925-0a67e5184ce8:1,
fa4e6db0-3475-46c9-8e9c-2fab646ed636:1-1584935:2492207-2511313
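To quantify the gap between the two gtid_executed values above, here is a rough standalone Python sketch (no MySQL connector involved; the GTID strings are copied from the output above). It parses both sets and lists the transactions present on the new value but not the old:

```python
def parse_gtid_set(gtid_set):
    """Parse a gtid_executed string into {uuid: set of transaction ids}."""
    result = {}
    for part in gtid_set.replace("\n", "").split(","):
        uuid, *intervals = part.strip().split(":")
        txns = result.setdefault(uuid, set())
        for interval in intervals:
            lo, _, hi = interval.partition("-")
            txns.update(range(int(lo), int(hi or lo) + 1))
    return result

# GTID sets copied from the output above.
old_value = ("efa6f74a-73f0-11ee-8925-0a67e5184ce8:1,"
             "fa4e6db0-3475-46c9-8e9c-2fab646ed636:1-1584925:2492207-2511313")
new_value = ("efa6f74a-73f0-11ee-8925-0a67e5184ce8:1,"
             "fa4e6db0-3475-46c9-8e9c-2fab646ed636:1-1584935:2492207-2511313")

old_set, new_set = parse_gtid_set(old_value), parse_gtid_set(new_value)
for uuid, txns in new_set.items():
    extra = sorted(txns - old_set.get(uuid, set()))
    if extra:
        print(f"{uuid}: {len(extra)} extra transactions "
              f"({extra[0]}..{extra[-1]})")
# prints: fa4e6db0-3475-46c9-8e9c-2fab646ed636: 10 extra transactions (1584926..1584935)
```

So in this example the surviving node has 10 transactions the old primary never acknowledged to the client.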
| group_replication_advertise_recovery_endpoints | DEFAULT |
| group_replication_allow_local_lower_version_join | OFF |
| group_replication_auto_increment_increment | 7 |
| group_replication_autorejoin_tries | 3 |
| group_replication_bootstrap_group | OFF |
| group_replication_clone_threshold | 9223372036854775807 |
| group_replication_communication_debug_options | GCS_DEBUG_NONE |
| group_replication_communication_max_message_size | 10485760 |
| group_replication_components_stop_timeout | 31536000 |
| group_replication_compression_threshold | 1000000 |
| group_replication_consistency | AFTER |
| group_replication_enforce_update_everywhere_checks | OFF |
| group_replication_exit_state_action | READ_ONLY |
| group_replication_flow_control_applier_threshold | 25000 |
| group_replication_flow_control_certifier_threshold | 25000 |
| group_replication_flow_control_hold_percent | 10 |
| group_replication_flow_control_max_quota | 0 |
| group_replication_flow_control_member_quota_percent | 0 |
| group_replication_flow_control_min_quota | 0 |
| group_replication_flow_control_min_recovery_quota | 0 |
| group_replication_flow_control_mode | QUOTA |
| group_replication_flow_control_period | 1 |
| group_replication_flow_control_release_percent | 50 |
| group_replication_force_members | |
| group_replication_group_name | fa4e6db0-3475-46c9-8e9c-2fab646ed636 |
| group_replication_group_seeds | 10.83.54.xx:33061,10.83.57.xx:33061,10.83.38.xx:33061 |
| group_replication_gtid_assignment_block_size | 1000000 |
| group_replication_ip_allowlist | 10.0.0.0/8 |
| group_replication_ip_whitelist | 10.0.0.0/8 |
| group_replication_local_address | 10.83.54.xx:33061 |
| group_replication_member_expel_timeout | 5 |
| group_replication_member_weight | 70 |
| group_replication_message_cache_size | 1073741824 |
| group_replication_poll_spin_loops | 0 |
| group_replication_recovery_complete_at | TRANSACTIONS_APPLIED |
| group_replication_recovery_compression_algorithms | uncompressed |
| group_replication_recovery_get_public_key | OFF |
| group_replication_recovery_public_key_path | |
| group_replication_recovery_reconnect_interval | 60 |
| group_replication_recovery_retry_count | 10 |
| group_replication_recovery_ssl_ca | |
| group_replication_recovery_ssl_capath | |
| group_replication_recovery_ssl_cert | |
| group_replication_recovery_ssl_cipher | |
| group_replication_recovery_ssl_crl | |
| group_replication_recovery_ssl_crlpath | |
| group_replication_recovery_ssl_key | |
| group_replication_recovery_ssl_verify_server_cert | OFF |
| group_replication_recovery_tls_ciphersuites | |
| group_replication_recovery_tls_version | TLSv1,TLSv1.1,TLSv1.2,TLSv1.3 |
| group_replication_recovery_use_ssl | OFF |
| group_replication_recovery_zstd_compression_level | 3 |
| group_replication_single_primary_mode | ON |
| group_replication_ssl_mode | DISABLED |
| group_replication_start_on_boot | OFF |
| group_replication_tls_source | MYSQL_MAIN |
| group_replication_transaction_size_limit | 150000000 |
| group_replication_unreachable_majority_timeout | 0 |
| innodb_replication_delay | 0 |
| replication_optimize_for_static_plugin_config | OFF |
| replication_sender_observe_commit_only | OFF |
How to repeat:
1. Set up a 3-node Group Replication cluster
2. Set group_replication_consistency=AFTER on all nodes
3. Kill the mysqld process on the primary
4. Check gtid_executed on the old primary and the new primary
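For step 4, MySQL's built-in GTID functions can compute the difference directly. A sketch (the quoted set strings are placeholders to be replaced with each node's captured value):

```sql
-- On each node, capture the executed GTID set:
SELECT @@GLOBAL.gtid_executed;

-- Then, on any node, compute what the new primary has that the old one lacks
-- (paste the two captured values in place of the placeholders):
SELECT GTID_SUBTRACT('<new primary gtid_executed>',
                     '<old primary gtid_executed>') AS extra_on_new_primary;
```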