MySQL Forums

Mass crashes of MySQL APIs (mysqld).
Posted by: Daniel Schroers
Date: October 13, 2016 08:14AM

We are using the open source edition of MySQL Cluster 7.4.9. The downloads are mysql-cluster-gpl-7.4.9-solaris10-x86_64.tar.gz and mysql-cluster-gpl-7.4.9-solaris11-x86_64.tar.gz, for Solaris 10 and Solaris 11 respectively.

Our environment consists of 36 MySQL API nodes (mysqld) on Solaris 10 and 24 MySQL API nodes (mysqld) on Solaris 11, 16 MySQL Cluster data nodes (ndbmtd), all on Solaris 11, and two MySQL Cluster management nodes (ndb_mgmd) on Solaris 10.

From time to time, many MySQL API nodes (mysqld) crash at once. For example, on October 8th, 2016, 15 MySQL API nodes (mysqld) crashed between 21:23:17 and 21:23:18 UTC.

One example:

--

21:23:17 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=268435456
read_buffer_size=262144
max_used_connections=367
max_threads=3000
thread_count=265
connection_count=264
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2603761 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xa8ee99a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'my_print_stacktrace+0x1e [0x10c6576]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'handle_fatal_signal+0x305 [0xd413cd]
/lib/amd64/libc.so.1'__sighndlr+0x6 [0xffff80ffbf4e7f76]
/lib/amd64/libc.so.1'call_user_handler+0x2ce [0xffff80ffbf4dafce]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'__1cOexecute_commit6FpnHThd_ndb_pnONdbTransaction_iipI_i_+0x9a [0x13e75f6] [Signal 11 (SEGV)]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'__1cNha_ndbclusterQexec_bulk_update6MpI_i_+0x55 [0x13d061d]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'__1cMmysql_update6FpnDTHD_pnKTABLE_LIST_rnEList4nEItem___5pn0C_IpnIst_order_XnPenum_duplicates_bpX9C_i_+0x17f1 [0xe516d9]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'__1cVmysql_execute_command6FpnDTHD__i_+0x862 [0xdd1ffe]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'__1cLmysql_parse6FpnDTHD_pcIpnMParser_state__v_+0x33c [0xdd9cf4]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'__1cQdispatch_command6FnTenum_server_command_pnDTHD_pcI_b_+0xd9e [0xdcf3e6]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'__1cKdo_command6FpnDTHD__b_+0xca [0xdce4ae]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'__1cYdo_handle_one_connection6FpnDTHD__v_+0x157 [0xd9e2e7]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'handle_one_connection+0x47 [0xd9de33]
/usr/local/mysql-cluster-gpl-7.4.9-solaris11-x86_64/bin/mysqld'pfs_spawn_thread+0x14c [0x156c9a4]
/lib/amd64/libc.so.1'_thrp_setup+0xa5 [0xffff80ffbf4e7ba9]
/lib/amd64/libc.so.1'_lwp_start+0x0 [0xffff80ffbf4e7e50]
Please read http://dev.mysql.com/doc/refman/5.1/en/resolve-stack-dump.html
and follow instructions on how to resolve the stack trace.
Resolved stack trace is much more helpful in diagnosing the
problem, so please do resolve it

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (9abb7348): UPDATE alive SET last = UNIX_TIMESTAMP(), processnameid = 10256132,active=1 WHERE id = 10256132 AND @query:=0437
Connection ID (thread ID): 13123
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

--

The other 14 look identical, apart from different queries and/or mysql-cluster-gpl-7.4.9-solaris10-x86_64 in the paths in the case of Solaris 10.
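To check that the crash logs really share one call chain, the backtraces can be reduced to their symbol names and compared. The following is only a sketch: it assumes each frame is printed in the same `path'symbol+0xoffset [0xaddress]` form as in the trace above, and the function names and the host-name keys are hypothetical.

```python
import re

# One frame as printed in the trace above:
#   <binary path>'<symbol>+0x<offset> [0x<address>]
FRAME_RE = re.compile(
    r"^(?P<binary>\S+)'(?P<symbol>[^'+\s]+)\+0x[0-9a-f]+ \[0x[0-9a-f]+\]"
)


def trace_signature(log_text):
    """Return the symbol names of the backtrace, top frame first.

    The install path, offsets and addresses are dropped, so a Solaris 10
    and a Solaris 11 binary give the same signature for the same call chain.
    """
    symbols = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            symbols.append(match.group("symbol"))
    return tuple(symbols)


def group_by_signature(logs):
    """Map each distinct signature to the hosts that produced it.

    `logs` is a dict of host name -> crash log text (hypothetical layout).
    """
    groups = {}
    for host, text in logs.items():
        groups.setdefault(trace_signature(text), []).append(host)
    return groups
```

If all 15 logs end up in a single group, the crashes share the same path through execute_commit; more than one group would point to different crash sites.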

We do not know whether it is just a coincidence, but in more than 90% of the cases we observed 'Got error 4010 when reading table' messages in the minutes before the crash. The log files of the MySQL API nodes (mysqld) that did not crash do not contain this error at all.
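One way to quantify this correlation is to count the error 4010 messages that fall within a few minutes before each crash. The sketch below assumes that each error log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that those timestamps are in the same time zone as the crash time; both are assumptions about the log layout, and the function names and the five-minute window are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed line layout: "<YYYY-MM-DD HH:MM:SS> ... Got error 4010 when reading table ..."
TS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\b.*Got error 4010 when reading table"
)


def error_4010_times(log_lines):
    """Return the timestamps of all 'Got error 4010' lines."""
    times = []
    for line in log_lines:
        match = TS_RE.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def count_before_crash(log_lines, crash_time, window_minutes=5):
    """Count error 4010 lines in the window that ends at crash_time."""
    start = crash_time - timedelta(minutes=window_minutes)
    return sum(1 for t in error_4010_times(log_lines) if start <= t <= crash_time)
```

Running this over the logs of the crashed and the surviving mysqld instances, with the crash time taken from the "mysqld got signal 11" line, would show whether the error is really confined to the nodes that crashed.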

Can someone shed some light on this issue?
