MySQL Forums
Forum List  »  Replication

Transaction inconsistency in Group Replication with AFTER mode
Posted by: Ankur Shukla
Date: November 16, 2023 03:54AM

Hi

Can someone explain the following behavior? I expected that since Paxos is used, such a data inconsistency would not be possible. However, I can consistently reproduce it.

I have a 3-node MySQL Group Replication cluster with AFTER consistency.
When a node is abruptly shut down or disconnected from the cluster, ongoing transactions on the primary are rolled back due to certification failure. However, those transactions are applied on the secondary nodes. No writes are done directly on the secondary nodes. This leads to a data inconsistency: a client connected to the primary sees its connection terminated (if the primary goes down) and assumes the transaction failed, but the data is actually present in the cluster because a secondary node applied the transaction.

I verified this for the following scenarios:
1. one of the secondary nodes is shut down abruptly
2. the primary is shut down abruptly

In both cases, the secondary nodes had more transactions than the primary. I verified this using gtid_executed. E.g.:
Old: efa6f74a-73f0-11ee-8925-0a67e5184ce8:1,
fa4e6db0-3475-46c9-8e9c-2fab646ed636:1-1584925:2492207-2511313
New: efa6f74a-73f0-11ee-8925-0a67e5184ce8:1,
fa4e6db0-3475-46c9-8e9c-2fab646ed636:1-1584935:2492207-2511313
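For anyone wanting to confirm exactly which transactions only the secondaries have, here is a small Python sketch of the interval subtraction that MySQL's built-in GTID_SUBTRACT() function performs server-side (the GTID sets below are copied from my output above):

```python
def parse_gtid_set(s):
    """Parse 'uuid:1-10:20-30,uuid2:1' into {uuid: [(lo, hi), ...]}."""
    out = {}
    for part in s.replace("\n", "").split(","):
        uuid, *ranges = part.split(":")
        ivals = out.setdefault(uuid, [])
        for r in ranges:
            lo, _, hi = r.partition("-")
            ivals.append((int(lo), int(hi or lo)))
    return out

def subtract(a, b):
    """Return the sub-intervals of `a` not covered by `b`.

    Both inputs are sorted, non-overlapping (lo, hi) interval lists,
    as produced by parse_gtid_set().
    """
    result = []
    for lo, hi in a:
        for blo, bhi in b:
            if bhi < lo or blo > hi:
                continue  # no overlap with this interval of b
            if blo > lo:
                result.append((lo, blo - 1))  # uncovered prefix
            lo = bhi + 1
            if lo > hi:
                break
        if lo <= hi:
            result.append((lo, hi))  # uncovered suffix
    return result

old = parse_gtid_set(
    "efa6f74a-73f0-11ee-8925-0a67e5184ce8:1,"
    "fa4e6db0-3475-46c9-8e9c-2fab646ed636:1-1584925:2492207-2511313")
new = parse_gtid_set(
    "efa6f74a-73f0-11ee-8925-0a67e5184ce8:1,"
    "fa4e6db0-3475-46c9-8e9c-2fab646ed636:1-1584935:2492207-2511313")

extra = {u: subtract(new[u], old.get(u, [])) for u in new}
print(extra["fa4e6db0-3475-46c9-8e9c-2fab646ed636"])
# -> [(1584926, 1584935)]  i.e. ten transactions present only on the secondaries
```

On a live server the same answer comes from SELECT GTID_SUBTRACT('new_set', 'old_set'), which is the more convenient way to do this in practice.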


| group_replication_advertise_recovery_endpoints | DEFAULT |
| group_replication_allow_local_lower_version_join | OFF |
| group_replication_auto_increment_increment | 7 |
| group_replication_autorejoin_tries | 3 |
| group_replication_bootstrap_group | OFF |
| group_replication_clone_threshold | 9223372036854775807 |
| group_replication_communication_debug_options | GCS_DEBUG_NONE |
| group_replication_communication_max_message_size | 10485760 |
| group_replication_components_stop_timeout | 31536000 |
| group_replication_compression_threshold | 1000000 |
| group_replication_consistency | AFTER |
| group_replication_enforce_update_everywhere_checks | OFF |
| group_replication_exit_state_action | READ_ONLY |
| group_replication_flow_control_applier_threshold | 25000 |
| group_replication_flow_control_certifier_threshold | 25000 |
| group_replication_flow_control_hold_percent | 10 |
| group_replication_flow_control_max_quota | 0 |
| group_replication_flow_control_member_quota_percent | 0 |
| group_replication_flow_control_min_quota | 0 |
| group_replication_flow_control_min_recovery_quota | 0 |
| group_replication_flow_control_mode | QUOTA |
| group_replication_flow_control_period | 1 |
| group_replication_flow_control_release_percent | 50 |
| group_replication_force_members | |
| group_replication_group_name | fa4e6db0-3475-46c9-8e9c-2fab646ed636 |
| group_replication_group_seeds | 10.83.54.xx:33061,10.83.57.xx:33061,10.83.38.xx:33061 |
| group_replication_gtid_assignment_block_size | 1000000 |
| group_replication_ip_allowlist | 10.0.0.0/8 |
| group_replication_ip_whitelist | 10.0.0.0/8 |
| group_replication_local_address | 10.83.54.xx:33061 |
| group_replication_member_expel_timeout | 5 |
| group_replication_member_weight | 70 |
| group_replication_message_cache_size | 1073741824 |
| group_replication_poll_spin_loops | 0 |
| group_replication_recovery_complete_at | TRANSACTIONS_APPLIED |
| group_replication_recovery_compression_algorithms | uncompressed |
| group_replication_recovery_get_public_key | OFF |
| group_replication_recovery_public_key_path | |
| group_replication_recovery_reconnect_interval | 60 |
| group_replication_recovery_retry_count | 10 |
| group_replication_recovery_ssl_ca | |
| group_replication_recovery_ssl_capath | |
| group_replication_recovery_ssl_cert | |
| group_replication_recovery_ssl_cipher | |
| group_replication_recovery_ssl_crl | |
| group_replication_recovery_ssl_crlpath | |
| group_replication_recovery_ssl_key | |
| group_replication_recovery_ssl_verify_server_cert | OFF |
| group_replication_recovery_tls_ciphersuites | |
| group_replication_recovery_tls_version | TLSv1,TLSv1.1,TLSv1.2,TLSv1.3 |
| group_replication_recovery_use_ssl | OFF |
| group_replication_recovery_zstd_compression_level | 3 |
| group_replication_single_primary_mode | ON |
| group_replication_ssl_mode | DISABLED |
| group_replication_start_on_boot | OFF |
| group_replication_tls_source | MYSQL_MAIN |
| group_replication_transaction_size_limit | 150000000 |
| group_replication_unreachable_majority_timeout | 0 |
| innodb_replication_delay | 0 |
| replication_optimize_for_static_plugin_config | OFF |
| replication_sender_observe_commit_only | OFF |

How to repeat:
1. Set up a 3-node Group Replication cluster
2. Set group_replication_consistency=AFTER on all nodes
3. Kill the mysqld process on the primary
4. Check gtid_executed on the old primary and the new primary
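For reference, steps 3 and 4 in shell form. This is just a sketch of what I run; the hostnames are placeholders, and it assumes the mysql client can reach both nodes:

```shell
# Step 3: on the primary, simulate an abrupt failure (no clean shutdown).
kill -9 "$(pidof mysqld)"

# Step 4: after failover, compare executed GTID sets.
# On the new primary:
mysql -h new-primary -e "SELECT @@GLOBAL.gtid_executed\G"
# On the old primary, once mysqld is restarted (it stays out of the group
# because group_replication_start_on_boot=OFF in my config):
mysql -h old-primary -e "SELECT @@GLOBAL.gtid_executed\G"
```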
