Re: Sure: Bug with Encoding: \x80 \x81 ...
Posted by: Bambarbia Kirkudu
Date: May 31, 2007 08:47AM

Thanks Mark,

>What character encoding is your connection set to use?
The connection URL parameters play no role: it happens both with and without them
(see another post, 154822,154822#msg-154822). Why do you need that?

>What character encoding is your table and/or columns in the table?
[SHOW VARIABLES LIKE 'character_set%';] shows the correct 'utf8' for server, client, etc. (except the binary file one, of course).

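The reply from the server (with utf8 encoding) is very strange: '\x80\x81'. It looks as if JDBC<->MySQL exchanges text in a single-byte (extended ASCII) encoding. The Euro sign can't be encoded as \x80 in utf8; that byte is the Euro sign only in windows-1252. In Unicode it MUST BE U+20AC, which utf8 encodes as the three bytes \xE2\x82\xAC...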
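To make the byte values concrete, here is a minimal Java sketch that prints how the Euro sign and U+0080 are encoded. The class name `EuroEncoding` and the helper `hex` are made up for illustration, and it assumes the JDK provides the windows-1252 charset (full JDKs normally do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EuroEncoding {
    // Formats bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // Euro sign as a Java String (one UTF-16 code unit)

        // Euro sign in UTF-8: three bytes
        System.out.println(hex(euro.getBytes(StandardCharsets.UTF_8)));          // E2 82 AC

        // Euro sign in windows-1252: the single byte 0x80
        System.out.println(hex(euro.getBytes(Charset.forName("windows-1252")))); // 80

        // The control character U+0080 in UTF-8: two bytes, never a bare 0x80
        System.out.println(hex("\u0080".getBytes(StandardCharsets.UTF_8)));      // C2 80
    }
}
```

A bare \x80 byte is therefore never valid utf8 on its own, which would be consistent with the text having been encoded in a single-byte charset somewhere along the way.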

>Please note that default-character-set is deprecated
I do not use it. I use only character-set-server=utf8, and it sets utf8 as the default for all the other parameters.

I have noticed that the same kind of error happens with PHP clients... It's probably not JDBC related, but I don't know where the conversion between Java String and MySQL utf8 happens... At the very least, the JDBC driver could replace such characters with the specially designed "Reverse Question Mark" symbol before sending them to the server.
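Until a driver does something like that, the replacement can be sketched on the client side. This is only an illustration under two assumptions: that the offending characters arrive in the Java String as the C1 control range U+0080..U+009F, and that the intended symbol is U+FFFD (REPLACEMENT CHARACTER). The class and method names are hypothetical.

```java
public class C1Sanitizer {
    // Assumption: U+FFFD is the replacement symbol meant above.
    static final char REPLACEMENT = '\uFFFD';

    // Replaces every char in U+0080..U+009F with REPLACEMENT; all other chars are kept.
    static String replaceC1Controls(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean isC1 = c >= '\u0080' && c <= '\u009F';
            sb.append(isC1 ? REPLACEMENT : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dirty = "price: \u0080\u0081 100";
        System.out.println(replaceC1Controls(dirty));
    }
}
```

This loses information, so it only makes sense when those characters are known to be garbage rather than mis-decoded windows-1252 text.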

Temporary workaround: I'll use BLOB instead of TEXT and serialize the Java String as a byte array (probably with an additional field that stores the encoding scheme).
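A rough sketch of that workaround, not tested against a real server: the String is encoded to bytes with an explicit charset, and the charset name is kept next to it so the bytes can be decoded later. The class name, the table `pages` and the columns `body` and `body_charset` are invented for the example; `PreparedStatement.setBytes` and `setString` are the standard JDBC calls.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BlobTextCodec {
    // Encodes the text with an explicit charset instead of the platform default.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Decodes bytes read back from the BLOB using the stored charset name.
    static String decode(byte[] data, String charsetName) {
        return new String(data, Charset.forName(charsetName));
    }

    // Hypothetical statement:
    //   INSERT INTO pages (body, body_charset) VALUES (?, ?)
    // where body is a BLOB and body_charset is a short VARCHAR.
    static void bind(PreparedStatement ps, String text) throws SQLException {
        ps.setBytes(1, encode(text, StandardCharsets.UTF_8));
        ps.setString(2, StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) {
        String page = "caf\u00E9 \u20AC \u0080\u0081";
        byte[] stored = encode(page, StandardCharsets.UTF_8);
        String restored = decode(stored, "UTF-8");
        System.out.println(page.equals(restored)); // true
    }
}
```

Because a BLOB column has no character set, the server should not try to validate or convert these bytes; the cost is that the column can no longer be searched or compared as text on the server side.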

The problem is very rare... I have parsed and stored millions of web pages in the database, and only a few contain such bytes... It hasn't happened with VARCHAR yet.
