I am struggling to find a zero-downtime setup for a MySQL InnoDB cluster.
As a lab installation of MySQL Router in front of a three-node MySQL InnoDB cluster, I use the stack at
https://github.com/garutilorenzo/mysql-innodb-cluster
This works wonderfully, with one exception:
If I test a MySQL primary node switch under (heavy) write load, clients get the error message "SQLSTATE[HY000]: General error: 1290 The MySQL server is running with the --read-only option so it cannot execute this statement".
From what I can tell, this specific error message is caused by an architectural problem in how this whole system is set up: MySQL Router is configured with a TTL, i.e. it repeatedly _polls_ the cluster for its (primary/secondary) topology. Once it has learnt about a change of primary, it does redirect, but not before clients get the error message above.
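
To make the window concrete, here is a minimal Python sketch of the behaviour as I understand it. It is a toy model, not MySQL Router's actual implementation: the class and function names (`Cluster`, `PollingRouter`, `simulate`) and the node names are made up for illustration, time is simulated in integer milliseconds, and the router refreshes its cached primary lazily on the first request after the TTL has elapsed rather than in a background thread.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Toy cluster: exactly one node accepts writes at any time."""
    primary: str

    def write(self, node: str) -> None:
        # Writes sent to a non-primary node fail like error 1290.
        if node != self.primary:
            raise RuntimeError("1290: server is running with --read-only")


class PollingRouter:
    """Toy router that only re-reads the topology once per TTL."""

    def __init__(self, cluster: Cluster, ttl_ms: int) -> None:
        self.cluster = cluster
        self.ttl_ms = ttl_ms
        self.cached_primary = cluster.primary
        self.last_refresh_ms = 0

    def route(self, now_ms: int) -> str:
        # Refresh the cached primary only when the TTL has expired.
        if now_ms - self.last_refresh_ms >= self.ttl_ms:
            self.cached_primary = self.cluster.primary
            self.last_refresh_ms = now_ms
        return self.cached_primary


def simulate(ttl_ms: int, switch_at_ms: int,
             write_interval_ms: int, duration_ms: int) -> int:
    """Return how many writes hit the old primary after the switch."""
    cluster = Cluster(primary="node1")
    router = PollingRouter(cluster, ttl_ms)
    failed = 0
    for now_ms in range(0, duration_ms, write_interval_ms):
        if now_ms >= switch_at_ms:
            cluster.primary = "node2"  # primary switch has happened
        try:
            cluster.write(router.route(now_ms))
        except RuntimeError:
            failed += 1
    return failed


if __name__ == "__main__":
    for ttl in (500, 100):
        errors = simulate(ttl_ms=ttl, switch_at_ms=1150,
                          write_interval_ms=50, duration_ms=3000)
        print(f"ttl={ttl}ms -> {errors} failed writes")
```

In this model a shorter TTL shrinks the window but does not close it: any write arriving between the switch and the next refresh still lands on the old primary.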
Is there any means to have a server-side architecture which prevents this, either
* with MySQL Router, or
* using some alternative architecture?
One implication of the current setup is that a primary node switch will always disrupt clients, in a rather annoying way.
(On the client side, dropping that connection and retrying is the obvious solution; I am interested in making the server-side architecture friendlier towards clients.)
Many thanks!