Re: Problem with arabic charset
Posted by: Rick James
Date: February 10, 2011 10:05PM

2-byte utf8 codes starting with C3 are Western European (accented Latin) characters, not Arabic; Arabic letters in utf8 start with D8 through DB. So I deduce that you have "double encoding", as discussed in the document. If the following is supposed to be the same string, then I suspect triple encoding! --
D8A3E2808DD8A3E2809CD8A3C2A320D8A3E280A1D8A3D88C
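
One way to check the triple-encoding hunch is to peel one layer off the longer hex string and see what is left. The sketch below is Python rather than SQL, and it assumes the charset involved in the extra layer was cp1256 (Windows Arabic). The thread never names that charset, so treat it as a guess.

```python
# Hex strings copied from this thread.
longer = bytes.fromhex("D8A3E2808DD8A3E2809CD8A3C2A320D8A3E280A1D8A3D88C")
shorter = bytes.fromhex("C39EC393C3A320C387C3A1")

# Every non-space character in the shorter string decodes into U+00C0..U+00FF,
# the accented-Latin range that the C3 lead byte covers.
latin_only = all(0xC0 <= ord(ch) <= 0xFF
                 for ch in shorter.decode("utf-8") if ch != " ")
print(latin_only)               # True

# Undo one layer: read the longer bytes as utf8, then write the characters
# back out as cp1256 (ASSUMPTION: cp1256 is a guess, not stated in the thread).
peeled = longer.decode("utf-8").encode("cp1256")

print(peeled.hex().upper())     # C39EC393C3A320C387C3A1
print(peeled == shorter)        # True
```

If the guess is right, the 24-byte string is the 11-byte string with one more round of misinterpretation on top, which is consistent with triple encoding.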

11 | 6 -- yes it looked like 3 chars, a space (20), then 2 more chars (11 bytes, 6 chars).
11 | 11 -- BLOB has no distinction between "character" and "byte".
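
The two pairs of numbers can be reproduced outside MySQL. A small Python sketch, assuming the first number in each pair is the byte length and the second is the character length (what LENGTH() and CHAR_LENGTH() would report):

```python
raw = bytes.fromhex("C39EC393C3A320C387C3A1")

# Read as utf8 text: 5 two-byte characters plus 1 single-byte space.
as_text = raw.decode("utf-8")
print(len(raw), "|", len(as_text))   # 11 | 6

# Read as a BLOB: there is no charset, so every byte counts as one "character".
print(len(raw), "|", len(raw))       # 11 | 11
```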

Try these alters:
C39EC393C3A320C387C3A1 -> tinyblob -> utf8 -> latin1 -> tinyblob -> utf8
(Sorry, my brain is about to explode trying to figure how to fix that string.)
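
For what it is worth, the byte-level effect of that chain can be traced in Python. This is only a sketch of the idea, not tested ALTER statements. Passing through a blob type changes the charset label without touching the bytes, while utf8 -> latin1 really converts them. The final decode uses cp1256 instead of utf8, which is an assumption; the thread never names the original charset.

```python
stored = bytes.fromhex("C39EC393C3A320C387C3A1")

# tinyblob -> utf8: relabel the stored bytes as utf8 text (bytes unchanged).
as_utf8 = stored.decode("utf-8")

# utf8 -> latin1: a real conversion; each character becomes a single byte.
as_latin1 = as_utf8.encode("latin-1")
print(as_latin1.hex().upper())        # DED3E320C7E1

# latin1 -> tinyblob: drop the charset label again (bytes unchanged).
blob = bytes(as_latin1)

# tinyblob -> utf8: these 6 bytes are not valid utf8, so relabeling them
# as utf8 cannot work as written.
try:
    blob.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)                     # False

# ASSUMPTION: the bytes are really cp1256 (Windows Arabic).
recovered = blob.decode("cp1256")
print(ascii(recovered))               # '\u0642\u0633\u0645 \u0627\u0644'
```

Under that assumption the result is 3 Arabic letters, a space, then 2 more letters, which matches the 6-character count above.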

utf8_general_ci and utf8_unicode_ci -- those are collations; they control comparison and sorting, and are not relevant to the issue of encoding.

