Our working theory was that the Java garbage collector was preempting a
database-row-fetching thread at the wrong time, producing the "Communications link failure" EOFException.
We tried setting the MySQL session variable NET_READ_TIMEOUT to 300 seconds (its default is 30 seconds), on the assumption that the garbage collector would never pause a thread for more than five minutes, but this did not eliminate the error. (We did not try changing the NET_WRITE_TIMEOUT variable as well. I'd be interested to know whether there is a set of MySQL configuration changes that would eliminate the error.)
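For reference, here is a minimal sketch of how the session timeouts could be set from JDBC. The class and method names (SessionTimeouts, setSessionSql, apply) are made up for illustration, and setting NET_WRITE_TIMEOUT alongside NET_READ_TIMEOUT is the untested variation mentioned above, not something we confirmed fixes the error.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of any library.
final class SessionTimeouts {

    // Builds a "SET SESSION <variable> = <seconds>" statement.
    // The variable name is restricted to lowercase letters and underscores
    // because it is concatenated directly into the SQL text.
    static String setSessionSql(String variable, int seconds) {
        if (!variable.matches("[a-z_]+")) {
            throw new IllegalArgumentException("unexpected variable name: " + variable);
        }
        if (seconds <= 0) {
            throw new IllegalArgumentException("timeout must be positive: " + seconds);
        }
        return "SET SESSION " + variable + " = " + seconds;
    }

    // Applies the same timeout to both network timeouts on one connection.
    // Session variables only affect the connection they are set on.
    static void apply(Connection conn, int seconds) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(setSessionSql("net_read_timeout", seconds));
            st.execute(setSessionSql("net_write_timeout", seconds));
        }
    }
}
```

Calling SessionTimeouts.apply(conn, 300) right after opening a connection would correspond to the 300-second setting we tried, with the write timeout added.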
After reverting the NET_READ_TIMEOUT change, we added the following options to the
java command with the goal of keeping the garbage collector from pausing other threads for longer than 25 seconds:
-XX:MaxGCPauseMillis=25000
-XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode
-XX:+CMSIncrementalPacing
-XX:CMSIncrementalDutyCycleMin=0
-XX:CMSIncrementalDutyCycle=10
and it eliminated the error.
(See http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html for a detailed description of the options.)
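If you want to check whether garbage collection is a plausible culprit in your own application, one rough approach is to read the collector statistics the JVM exposes through java.lang.management. The sketch below is only an illustration (the class name GcPauseReport is made up). It reports cumulative counts and times per collector, not the length of any single pause, so it can hint at heavy GC activity but cannot prove that a particular pause exceeded a timeout. GC logging (for example with -verbose:gc) gives per-collection detail.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of any library.
final class GcPauseReport {

    // Returns one line per collector with its name, the number of
    // collections so far, and the approximate accumulated collection
    // time in milliseconds. Either number may be -1 if the JVM does
    // not report it.
    static String summarize() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", totalMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }
}
```

Logging GcPauseReport.summarize() just before and just after a failing fetch would show whether collections ran in between and roughly how much time they took in total.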