Re: UTF8 Chinese String Comparison
Posted by: Rick James
Date: September 27, 2009 05:44PM

I fear that the data in the table is already corrupted -- probably double-encoded.

Please run
SELECT a, LENGTH(a), CHAR_LENGTH(a), HEX(a)
   FROM test_utf
   ORDER BY LENGTH(a) ASC
   LIMIT 5;
If 'a' is "one Chinese character", you should (probably) get
LENGTH=3; CHAR_LENGTH=1; and 6 hex digits.
If you get twice the LENGTH and twice the hex digits (LENGTH=6 and 12 hex digits; CHAR_LENGTH will typically be 3 rather than 1), then the data is not stored correctly, probably due to double-encoding. In that case, please start over on loading the data. (Yes, there may be other ways to 'fix' the data, but I don't know them offhand; they probably involve multiple ALTERs.)
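To see where those numbers come from, here is a minimal Python sketch of the byte arithmetic, not a MySQL tool. The helper names `describe` and `double_encode` and the sample character "中" are illustrative assumptions. It also assumes the misreading behaves like ISO-8859-1; MySQL's latin1 differs for bytes 0x80-0x9F, which this sample character does not contain.

```python
def describe(text: str) -> dict:
    """Mimic LENGTH (bytes), CHAR_LENGTH (characters) and HEX for a UTF-8 column."""
    raw = text.encode("utf-8")
    return {
        "length": len(raw),          # what LENGTH(a) would report
        "char_length": len(text),    # what CHAR_LENGTH(a) would report
        "hex": raw.hex().upper(),    # what HEX(a) would report
    }


def double_encode(text: str) -> str:
    """Simulate double-encoding: UTF-8 bytes misread as latin1 characters.

    Each misread byte becomes its own character, which is then stored
    as UTF-8 a second time when describe() encodes it.
    """
    return text.encode("utf-8").decode("latin-1")


sample = "中"  # one Chinese character, chosen arbitrarily for illustration
correct = describe(sample)
broken = describe(double_encode(sample))

print(correct)
print(broken)
```

In this sketch the correctly stored character takes 3 bytes (6 hex digits), while the double-encoded one takes 6 bytes (12 hex digits) and counts as 3 characters.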

