Re: UTF8 Chinese String Comparison
Posted by: Rick James
Date: September 27, 2009 05:44PM

I fear that the data in the table is already corrupted -- probably double-encoded.

Please run:
SELECT a, LENGTH(a), CHAR_LENGTH(a), HEX(a)
   FROM test_utf
   ORDER BY LENGTH(a) ASC
   LIMIT 5;
If 'a' holds a single Chinese character, you should (probably) get
LENGTH=3; CHAR_LENGTH=1; and 6 hex digits.
If LENGTH and the number of hex digits come out about twice that (with CHAR_LENGTH greater than 1), then the data is not stored correctly, probably due to double-encoding. In that case, please start over on loading the data. (Yes, there may be other ways to 'fix' the data, but I don't know them offhand; they probably involve multiple ALTERs.)
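To see why the numbers grow, here is a rough Python sketch of what double-encoding does to one character. It is only an illustration outside MySQL: the sample character and the helper name describe() are my own choices, and it uses Python's "latin-1" codec, which matches MySQL's latin1 for these particular bytes but not for every byte in the 0x80-0x9F range.

```python
# Simulate double-encoding of a single Chinese character.
# Assumption: the sample character and the describe() helper are
# illustrative only; they are not part of the original table or query.

def describe(raw: bytes):
    """Return (byte length, character length, hex), roughly what
    LENGTH(a), CHAR_LENGTH(a) and HEX(a) report for a utf8 column."""
    return (len(raw), len(raw.decode("utf-8")), raw.hex().upper())

ch = "\u4e2d"                      # one Chinese character (U+4E2D)
good = ch.encode("utf-8")          # correctly stored: 3 bytes

# Double-encoding: the UTF-8 bytes are misread as latin1 characters,
# and each of those characters is then encoded as UTF-8 again.
doubled = good.decode("latin-1").encode("utf-8")

print(describe(good))
print(describe(doubled))

# Reversing the two steps recovers the original bytes in this case.
recovered = doubled.decode("utf-8").encode("latin-1")
print(recovered == good)
```

For this character the correctly stored value is 3 bytes and 1 character, while the double-encoded value is 6 bytes and 3 characters, so a CHAR_LENGTH above 1 for a single character is as much a warning sign as the longer hex string.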

