MySQL Forums

InnoDB Cluster Issues
Posted by: Joe Ritz
Date: May 16, 2018 08:17AM

I have set up a MySQL InnoDB Cluster using MySQL Server 5.7.21, MySQL Shell 1.0.11, and MySQL Router 2.1.5. I am able to create a new cluster and bootstrap the router, and everything reports that it is online. I am able to kill the master and it switches to one of the slaves. I am able to bring the downed node back as a slave without issue.

The issues come in when all of the nodes go down. Once I try to bring back the MySQL server processes, they will not rejoin the cluster for anything. If I log into node1 (using mysqlsh --uri) and try to do a dba.getCluster('clusterName').status(), it tells me that I cannot do this from a standalone process. If I try to do a dba.rebootClusterFromCompleteOutage(), it asks if I want to remove the other nodes from the metadata. As best I can tell, node1 can't talk to the other nodes because it can't reach node2.address:13306. That thread won't come up for some unknown reason.

This whole thing seems to be super brittle and REALLY FRUSTRATING!!!! We had a situation where our master node filled up the local disk. MySQL Router didn't roll to the next node and make it master. Then, after we rebooted the MySQL servers one at a time, they didn't rejoin the cluster. Their states were out of sync across all 3 nodes. We had to export the data from our old master, blow away the installation, and reinstall. That process is completely unacceptable.
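For the complete-outage case described above, the recovery path can be sketched roughly as follows. This is only a sketch for MySQL Shell's JavaScript mode, not a verified fix for this setup: the function name rebootAndRejoin and the example account and hosts are hypothetical, and it assumes the shell session is already connected to the member chosen as the seed (MySQL's documentation advises picking the member whose gtid_executed set is the most complete).

```javascript
// Sketch only: rebuild the group from the seed member, then rejoin the rest.
// `dbaObject` is the global `dba` object that MySQL Shell provides; it is
// passed in as a parameter so the function can also be tried with a stub.
function rebootAndRejoin(dbaObject, clusterName, otherMembers) {
  // Re-create the group using the metadata stored on the connected instance.
  var cluster = dbaObject.rebootClusterFromCompleteOutage(clusterName);

  // Explicitly rejoin each remaining member by its connection URI.
  for (var i = 0; i < otherMembers.length; i++) {
    cluster.rejoinInstance(otherMembers[i]);
  }

  // Return the status document so the caller can inspect member states.
  return cluster.status();
}

// Example call inside mysqlsh (hypothetical account and hosts):
// rebootAndRejoin(dba, 'clusterName', ['root@node2:3306', 'root@node3:3306']);
```

In interactive mode, dba.rebootClusterFromCompleteOutage() may already offer to rejoin the other members, in which case the explicit rejoinInstance() calls would likely be unnecessary.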

I have to believe that MySQL wouldn't release such a product, so it has to be the order in which I am doing things or the way I have each node configured. So, I will list the .cnf file from each of my nodes.

What I would like is for each MySQL server to be added back into the cluster automatically as it is restarted. Having to do the dba.rebootClusterFromCompleteOutage() is total BS from a DevOps perspective. If there is a discrepancy between the databases, I would like a way to tell the cluster to take the data from a specific node and not worry about what transactions are missing.

Below is the .cnf file from node1

[mysqld]

server-id=1
tmpdir=/localdrive/tmp
basedir=/sharedrive/mysql
datadir=/localdrive/data
plugin-dir=/sharedrive/mysql/lib/plugin
port=3306
log-error=/localdrive/logs/error.log
slow-query-log=TRUE
slow-query-log-file=/localdrive/logs/slow_query.log
socket=/localdrive/mysql.sock
master-info-repository=TABLE
relay-log-info-repository=TABLE
transaction-write-set-extraction=XXHASH64
log-bin=1
binlog-checksum=NONE
log-slave-updates=ON
gtid-mode=ON
enforce-gtid-consistency=ON
binlog_format=ROW
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
report_port=3306
group_replication_start_on_boot = ON
group_replication = ON
group_replication_ip_whitelist=node1,node2,node3
group_replication_allow_local_disjoint_gtids_join = OFF
group_replication_allow_local_lower_version_join = OFF
group_replication_auto_increment_increment=7
group_replication_bootstrap_group = OFF
group_replication_components_stop_timeout = 31536000
group_replication_compression_threshold = 1000000
group_replication_enforce_update_everywhere_checks = OFF
group_replication_flow_control_applier_threshold = 25000
group_replication_flow_control_certifier_threshold = 25000
group_replication_flow_control_mode = QUOTA
group_replication_force_members
group_replication_group_name = 94ebbf27-5832-00a5-0ee2-c0072ab4597a
group_replication_group_seeds = node2,node3
group_replication_gtid_assignment_block_size = 10000000
group_replication_local_address = node1:13306
group_replication_member_weight = 50
group_replication_poll_spin_loops = 0
group_replication_recovery_complete_at = TRANSACTIONS_APPLIED
group_replication_recovery_reconnect_interval = 60
group_replication_recovery_retry_count = 10
group_replication_recovery_ssl_ca
group_replication_recovery_ssl_capath
group_replication_recovery_ssl_cert
group_replication_recovery_ssl_cipher
group_replication_recovery_ssl_crl
group_replication_recovery_ssl_crl_path
group_replication_recovery_ssl_key
group_replication_recovery_ssl_verify_server_cert = OFF
group_replication_recovery_use_ssl = OFF
group_replication_single_primary_mode = ON
group_replication_ssl_mode = DISABLED
group_replication_transaction_size_limit = 0
group_replication_unreachable_majority_timeout = 0
auto_increment_increment = 1
auto_increment_offset = 2

Below is the .cnf file from node2

[mysqld]

server-id=2
tmpdir=/localdrive/tmp
basedir=/sharedrive/mysql
datadir=/localdrive/data
plugin-dir=/sharedrive/mysql/lib/plugin
port=3306
log-error=/localdrive/logs/error.log
slow-query-log=TRUE
slow-query-log-file=/localdrive/logs/slow_query.log
socket=/localdrive/mysql.sock
master-info-repository=TABLE
relay-log-info-repository=TABLE
transaction-write-set-extraction=XXHASH64
log-bin=1
binlog-checksum=NONE
log-slave-updates=ON
gtid-mode=ON
enforce-gtid-consistency=ON
binlog_format=ROW
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
report_port=3306
group_replication_start_on_boot = ON
group_replication = ON
group_replication_ip_whitelist=node1,node2,node3
group_replication_allow_local_disjoint_gtids_join = OFF
group_replication_allow_local_lower_version_join = OFF
group_replication_auto_increment_increment=7
group_replication_bootstrap_group = OFF
group_replication_components_stop_timeout = 31536000
group_replication_compression_threshold = 1000000
group_replication_enforce_update_everywhere_checks = OFF
group_replication_flow_control_applier_threshold = 25000
group_replication_flow_control_certifier_threshold = 25000
group_replication_flow_control_mode = QUOTA
group_replication_force_members
group_replication_group_name = 94ebbf27-5832-00a5-0ee2-c0072ab4597a
group_replication_group_seeds = node1
group_replication_gtid_assignment_block_size = 10000000
group_replication_local_address = node2:13306
group_replication_member_weight = 50
group_replication_poll_spin_loops = 0
group_replication_recovery_complete_at = TRANSACTIONS_APPLIED
group_replication_recovery_reconnect_interval = 60
group_replication_recovery_retry_count = 10
group_replication_recovery_ssl_ca
group_replication_recovery_ssl_capath
group_replication_recovery_ssl_cert
group_replication_recovery_ssl_cipher
group_replication_recovery_ssl_crl
group_replication_recovery_ssl_crl_path
group_replication_recovery_ssl_key
group_replication_recovery_ssl_verify_server_cert = OFF
group_replication_recovery_use_ssl = OFF
group_replication_single_primary_mode = ON
group_replication_ssl_mode = DISABLED
group_replication_transaction_size_limit = 0
group_replication_unreachable_majority_timeout = 0
auto_increment_increment = 1
auto_increment_offset = 2

Below is the .cnf file from node3

[mysqld]

server-id=3
tmpdir=/localdrive/tmp
basedir=/sharedrive/mysql
datadir=/localdrive/data
plugin-dir=/sharedrive/mysql/lib/plugin
port=3306
log-error=/localdrive/logs/error.log
slow-query-log=TRUE
slow-query-log-file=/localdrive/logs/slow_query.log
socket=/localdrive/mysql.sock
master-info-repository=TABLE
relay-log-info-repository=TABLE
transaction-write-set-extraction=XXHASH64
log-bin=1
binlog-checksum=NONE
log-slave-updates=ON
gtid-mode=ON
enforce-gtid-consistency=ON
binlog_format=ROW
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
report_port=3306
group_replication_start_on_boot = ON
group_replication = ON
group_replication_ip_whitelist=node1,node2,node3
group_replication_allow_local_disjoint_gtids_join = OFF
group_replication_allow_local_lower_version_join = OFF
group_replication_auto_increment_increment=7
group_replication_bootstrap_group = OFF
group_replication_components_stop_timeout = 31536000
group_replication_compression_threshold = 1000000
group_replication_enforce_update_everywhere_checks = OFF
group_replication_flow_control_applier_threshold = 25000
group_replication_flow_control_certifier_threshold = 25000
group_replication_flow_control_mode = QUOTA
group_replication_force_members
group_replication_group_name = 94ebbf27-5832-00a5-0ee2-c0072ab4597a
group_replication_group_seeds = node1
group_replication_gtid_assignment_block_size = 10000000
group_replication_local_address = node3:13306
group_replication_member_weight = 50
group_replication_poll_spin_loops = 0
group_replication_recovery_complete_at = TRANSACTIONS_APPLIED
group_replication_recovery_reconnect_interval = 60
group_replication_recovery_retry_count = 10
group_replication_recovery_ssl_ca
group_replication_recovery_ssl_capath
group_replication_recovery_ssl_cert
group_replication_recovery_ssl_cipher
group_replication_recovery_ssl_crl
group_replication_recovery_ssl_crl_path
group_replication_recovery_ssl_key
group_replication_recovery_ssl_verify_server_cert = OFF
group_replication_recovery_use_ssl = OFF
group_replication_single_primary_mode = ON
group_replication_ssl_mode = DISABLED
group_replication_transaction_size_limit = 0
group_replication_unreachable_majority_timeout = 0
auto_increment_increment = 1
auto_increment_offset = 2
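
One detail in the listings above may be worth ruling out. The MySQL reference manual describes group_replication_group_seeds as a comma-separated list of host:port peer addresses, where the port is the one each peer uses in group_replication_local_address (13306 here) rather than the SQL port. The seed entries in all three files carry no port. Whether that explains the rejoin failures is not established. A minimal sketch of a check, usable in MySQL Shell's JavaScript mode or any JavaScript runtime, with a hypothetical function name:

```javascript
// Sketch only: return the seed entries that do not end in ":<port>".
// The function name is illustrative and not part of any MySQL API.
function seedsMissingPort(seedList) {
  var missing = [];
  var entries = seedList.split(',');
  for (var i = 0; i < entries.length; i++) {
    var entry = entries[i].trim();
    // Flag non-empty entries that lack a trailing numeric port.
    if (entry.length > 0 && !/:\d+$/.test(entry)) {
      missing.push(entry);
    }
  }
  return missing;
}

// Value from node1's file above: both entries are flagged.
var flagged = seedsMissingPort('node2,node3');

// Same hosts with the group communication port from the listings: none flagged.
var clean = seedsMissingPort('node2:13306,node3:13306');
```

If the check flags entries, one option to try is writing the seeds as, for example, node2:13306,node3:13306 on node1, and the equivalent list on the other two nodes.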

Thank you
