Re: Sure: Bug with Encoding: \x80 \x81 ...
Thanks Mark,
>What character encoding is your connection set to use?
The URL parameters play no role: it happens both with and without
useUnicode=true&characterEncoding=UTF-8
(see another post
http://forums.mysql.com/read.php?39,154822,154822#msg-154822 - why do you need that?)
>What character encoding is your table and/or columns in the table?
utf8
[SHOW VARIABLES LIKE 'character_set%';] shows the correct 'utf8' for server, client, etc. (except character_set_filesystem, which is binary, of course)
The reply from the server (with utf8 encoding) is very strange: '\x80\x81'. It looks as if JDBC<->MySQL is using ASCII-encoded text. The utf8 Euro sign can't be encoded as \x80. It MUST BE code point U+20AC (bytes \xE2\x82\xAC in UTF-8) instead...
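To make the byte values concrete, here is a minimal sketch that dumps how Java encodes the relevant characters. The class and method names (EuroBytes, hex) are made up for illustration, and the guess that \x80 comes from a windows-1252 source page or from a literal U+0080 character is my assumption, not something I have confirmed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EuroBytes {
    // Hex dump of a string's bytes in the given charset, e.g. "E2 82 AC".
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // Euro sign
        // UTF-8 needs three bytes for U+20AC.
        System.out.println(hex(euro, StandardCharsets.UTF_8));           // E2 82 AC
        // windows-1252 is where a single byte 0x80 means the Euro sign.
        System.out.println(hex(euro, Charset.forName("windows-1252")));  // 80
        // The C1 control characters U+0080 and U+0081 in UTF-8.
        System.out.println(hex("\u0080\u0081", StandardCharsets.UTF_8)); // C2 80 C2 81
    }
}
```

If the Java String really holds U+0080/U+0081, the driver is not necessarily at fault: those are valid (if unusual) code points, and a lone \x80 byte is simply not valid UTF-8.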
>Please note that default-character-set is deprecated
I do not use it. I use only character-set-server=utf8, and it sets utf8 as the default for all the other parameters.
I noticed that the same kind of error happens with PHP clients... It's probably not JDBC-related, but I don't know where the conversion between a Java String and MySQL utf8 happens... At the very least, the JDBC driver could replace such characters with a specially designated symbol ("Reverse Question Mark") before sending them to the server.
Temporary workaround: I'll use BLOB instead of TEXT and serialize the Java String as a byte array (probably with an additional field for the encoding scheme).
The problem is very rare... I have parsed and stored millions of web pages in the database, and only a few contain such bytes... It hasn't happened with VARCHAR yet.
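Until the driver does something like that, the replacement could be done on the client side. A rough sketch, under the assumption that the offending characters are the C1 controls U+0080..U+009F (I have not verified that this is the whole set) and using U+FFFD, the Unicode replacement character, as the substitute; C1Scrubber and scrub are hypothetical names.

```java
public class C1Scrubber {
    // Replace C1 control characters (U+0080..U+009F) with U+FFFD.
    // Assumption: these are the characters the server rejects.
    static String scrub(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c >= '\u0080' && c <= '\u009F' ? '\uFFFD' : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print the code points so the result is visible on any console.
        for (char c : scrub("a\u0080\u0081b").toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

This loses information, of course, so it only makes sense when the exact original bytes do not matter.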
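Roughly what I have in mind for the workaround, as a sketch only. The two-column layout (a BLOB for the bytes plus a column holding the charset name) and the names BlobText, encode, decode and bind are hypothetical; setBytes and setString are the standard java.sql.PreparedStatement methods.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BlobText {
    // Turn the text into raw bytes; the server then stores them untouched.
    static byte[] encode(String text, Charset cs) {
        return text.getBytes(cs);
    }

    // Rebuild the text from the stored bytes and the stored charset name.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Hypothetical statement shape: INSERT INTO pages (body, body_charset) VALUES (?, ?)
    static void bind(PreparedStatement ps, String text) throws SQLException {
        Charset cs = StandardCharsets.UTF_8;
        ps.setBytes(1, encode(text, cs)); // BLOB column, no charset conversion
        ps.setString(2, cs.name());       // remember how to decode it later
    }
}
```

The price is that the column can no longer be searched or compared as text on the server side.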