utf8_general_ci (MySQL 5.0.x vs. 5.1.36)
Hello,
I would like to migrate our application currently running on a server with SLES 10 and MySQL 5.0.26 to the new server with OpenSuSE 11.2 (64 bit) and MySQL 5.1.36. Both database servers are configured to use utf8 character set and utf8_general_ci collation, but it seems that those two versions of MySQL behave differently concerning the utf8_general_ci collation: In the old database there are for example both entries, German 'das' as well as 'daß' in a column with the unique key constraint, but restoring a database dump on the new server is leading to this error message: "ERROR 1062 (23000) at line 426: Duplicate entry 'daß for key '...'". Using mysql client leads to the same problem: For example, 'das' is already in the database. Doing a SELECT * from table WHERE column = 'daß'; on the old server will give no result and the entry 'daß' can be inserted. In contrast, doing the same select query on the new server will give one result, namely the row with 'das'. Testing all this on several other machines (OpenSuSE) with those MySQL version, namely 5.0.x and 5.1.36, leaded to the same result.
If I understand it correctly, according to the MySQL documentation (e.g.
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html and
http://dev.mysql.com/doc/refman/5.1/en/charset-unicode-sets.html) both versions should behave in the same way and the behavior of my new server should be correct, right?
Has anybody encountered the same problem? Is there a difference concerning this in those two versions or is this just a problem on my installations?
Does anybody know how to proceed best? What collation should be used for a really multilingual application?
Any help/advice/hint is very much appreciated!
Thanks a lot!