Cross language unicode issues
Posted by: Paul Dlug
Date: January 09, 2007 09:25PM

I am having trouble with the MySQL JDBC driver (Connector/J) when it reads records written by other languages, or writes records that other languages then read. In a nutshell, the problem is:

1) Write a UTF-8 string from Perl via DBD::mysql to a table. Reading the string back in Perl gives well-formed UTF-8 (after calling decode_utf8 on it to flag it as UTF-8 in Perl).
2) Read the same string back with Java: the string is mangled.

The problem occurs in reverse as well. Java reads strings back as valid UTF-8 if it wrote them itself, but those same strings are clearly mangled in the mysql command-line client or in another language such as Perl or Ruby.

For example, the first record below was written with Perl, the second with Java:

mysql> select * from articles;
| title |
| Mössbauer Effect |
| M??ssbauer Effect |

Clearly the first record is correct, well-formed UTF-8 data. I have ruled out Java's own UTF-8 string handling as the cause: if I read a file containing UTF-8 data in Java and write it out to another file, diff reports no changes, and Perl and Ruby both read the output as perfectly well-formed UTF-8. The problem appears only when writing to a MySQL database via JDBC.
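One way to compare what each client actually stored is to look at raw bytes instead of rendered text, for example by running SELECT HEX(title) FROM articles in MySQL and comparing the result with the expected UTF-8 encoding. The Java sketch below is only an illustration under assumptions: the class and method names are hypothetical, and misreadAsLatin1 simulates just one possible failure mode (UTF-8 bytes decoded as a single-byte charset somewhere between client and server), which may or may not be what is happening here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Connector/J or any library.
final class Utf8RoundTrip {

    // Upper-case hex dump of the UTF-8 encoding of s,
    // in the same form MySQL's HEX() function prints.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Simulates one possible mangling (an assumption, not a diagnosis):
    // the UTF-8 bytes of s are decoded as ISO-8859-1, so every
    // multi-byte character turns into several single-byte characters.
    static String misreadAsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u00F6 is the o-umlaut in "Moessbauer"; the escape keeps this
        // source file independent of the compiler's source encoding.
        String title = "M\u00F6ssbauer Effect";
        String mangled = misreadAsLatin1(title);

        System.out.println(utf8Hex("\u00F6"));   // C3B6: two bytes for one character
        System.out.println(utf8Hex(title));      // correct bytes for the title
        System.out.println(utf8Hex(mangled));    // bytes if the mangled form is stored as UTF-8
        System.out.println(mangled.length() - title.length()); // 1 extra character
    }
}
```

If the HEX() output for the Java-written row contains C383C2B6 where the Perl-written row contains C3B6, the data was encoded twice on the way in; if both rows hold C3B6, the difference is more likely in how each client's connection character set converts the data on the way out.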

All of my tables are InnoDB with character set utf8, collation utf8_bin. Defaults on the server are the same. I have tested with server versions 5.0.24 and 5.1.14 both with MySQL Connector/J version 5.0.4. The connect string I am using with JDBC has the options: useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8
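For reference, the connect string described above can be assembled as in the sketch below. The host and database names are placeholders, the class name is hypothetical, and only the three options quoted in this post are included. The SQL statement in the constant is a standard MySQL statement that lists the character-set variables for the current session; running it from each client (JDBC, Perl, the mysql command-line client) and comparing the results is one way to see whether the sessions agree.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; it only builds the URL string and does not
// open a connection, so no JDBC driver is needed to run it.
final class ConnectorUrl {

    // Standard MySQL statement showing the session's character-set settings.
    static final String DIAGNOSTIC_SQL = "SHOW VARIABLES LIKE 'character_set%'";

    static String build(String host, String database) {
        // The three options quoted in the post, in the same order.
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("useUnicode", "true");
        opts.put("characterEncoding", "UTF-8");
        opts.put("characterSetResults", "UTF-8");

        StringBuilder url = new StringBuilder("jdbc:mysql://")
                .append(host).append('/').append(database);
        char sep = '?';
        for (Map.Entry<String, String> e : opts.entrySet()) {
            url.append(sep).append(e.getKey()).append('=').append(e.getValue());
            sep = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // "localhost" and "test" are placeholder values.
        System.out.println(build("localhost", "test"));
        System.out.println(DIAGNOSTIC_SQL);
    }
}
```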

What am I doing wrong here? Is there a bug in Connector/J?

I can provide code samples and other information as needed.
