Hello all!
What is the correct representation of the Latin-1 'u umlaut' letter? I've referred to the following documents:
Latin-1:
http://www.unicode.org/charts/PDF/U0080.pdf
Cyrillic:
http://www.unicode.org/charts/PDF/U0400.pdf
All:
http://www.unicode.org/Public/UCA/4.0.0/allkeys-4.0.0.txt
These say that 'u umlaut' corresponds to 0x00FC. But MySQL stores it as 0xC3BC, and so does any UTF-8-aware text editor :(
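To compare the code point from the charts with the bytes each encoding actually produces, here is a minimal sketch. The class name `UmlautBytes` and the helper `hex` are made up for illustration and are not part of any library.

```java
import java.nio.charset.StandardCharsets;

public class UmlautBytes {
    // Render a byte array as upper-case hex, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String u = "\u00FC"; // 'u umlaut', code point U+00FC
        // The code point is the number listed in the Unicode charts.
        System.out.println("code point: " + Integer.toHexString(u.codePointAt(0)));
        // The encoded bytes depend on the charset used to serialize it.
        System.out.println("ISO-8859-1: " + hex(u.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8:      " + hex(u.getBytes(StandardCharsets.UTF_8)));
    }
}
```

This prints `fc` for the code point, `FC` for the Latin-1 bytes and `C3BC` for the UTF-8 bytes, so 0x00FC and 0xC3BC can both describe the same character at different levels.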
As a result, when I use Connector-J v3.1.12, I do not see the characters in the form MySQL stores them; instead I see 0x00FC. Consequently, Cyrillic characters are handled completely incorrectly.
As far as I can see in the mysql-connector-java-3.1.12 sources, the driver does not issue 'SET NAMES utf8' at the init stage, but relies on the 'character_set_server' variable instead. I can't set this variable to utf8, because I have different databases using different encodings on the same MySQL server, and this setting breaks their normal behaviour. Why not use 'character_set_database'?
Thanks a lot in advance.
The important datasource options are:
<connection-url>jdbc:mysql://localhost:3306/hbm</connection-url>
<driver-class>com.mysql.jdbc.Driver</driver-class>
<connection-property name="useUnicode">true</connection-property>
<!-- Has no effect: -->
<!-- connection-property name="characterEncoding">UTF-8</connection-property -->
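For testing outside the datasource descriptor, the same two options can also be passed as driver properties through plain JDBC. This is only a sketch: the class name `EncodingProps` and the credentials are placeholders, and it does not show whether `characterEncoding` changes the behaviour described above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public class EncodingProps {
    // Same URL as in the datasource configuration.
    static final String URL = "jdbc:mysql://localhost:3306/hbm";

    // Build the driver properties: credentials plus the two encoding options.
    static Properties encodingProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("useUnicode", "true");
        props.setProperty("characterEncoding", "UTF-8");
        return props;
    }

    // Open a connection; needs a reachable server and the driver on the classpath.
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(URL, encodingProperties(user, password));
    }

    public static void main(String[] args) {
        // Placeholder credentials; only the properties are printed here.
        Properties p = encodingProperties("someUser", "somePassword");
        System.out.println(URL
                + " useUnicode=" + p.getProperty("useUnicode")
                + " characterEncoding=" + p.getProperty("characterEncoding"));
    }
}
```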
The corresponding Java code is:
String name = ...;
PreparedStatement st = ...;
// Bind the parameter as raw UTF-8 bytes
st.setBytes(1, name.getBytes("UTF-8"));
// st.setString(1, name); has the same effect
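To illustrate what can happen when UTF-8 bytes are interpreted as Latin-1 somewhere between client and server, here is a small sketch. This is an assumption about one possible failure mode, not a confirmed diagnosis of the driver's behaviour; the class name `DoubleEncoding` and the method `misread` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class DoubleEncoding {
    // Encode the string as UTF-8, then decode those bytes as if they were Latin-1.
    static String misread(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String name = "\u00FC"; // 'u umlaut'
        String garbled = misread(name);
        // One character has become two, one per UTF-8 byte.
        System.out.println("length: " + garbled.length());
        for (char c : garbled.toCharArray()) {
            System.out.println("U+" + String.format("%04X", (int) c));
        }
    }
}
```

The single character U+00FC comes back as the two characters U+00C3 and U+00BC, which is the kind of corruption to look for when comparing what was sent with what is read back.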