Re: Problem with arabic charset
Posted by: Rick James
Date: February 10, 2011 10:05PM
2-byte utf8 codes starting with C3 are Western European characters, not Arabic (Arabic letters encode with lead bytes D8 through DB). So I deduce that you have "double encoding", as discussed in the document. If this is supposed to be the same string, then I suspect triple encoding:
D8A3E2808DD8A3E2809CD8A3C2A320D8A3E280A1D8A3D88C
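One way to test the triple-encoding suspicion is to check whether this longer hex string is just the C3... string misread one more time. The sketch below uses Python's codecs as a stand-in for MySQL's conversions and assumes the extra layer went through cp1256 (Windows Arabic). That charset is a guess, not something established in this thread.

```python
# Hex dumps quoted in this thread.
longer = bytes.fromhex("D8A3E2808DD8A3E2809CD8A3C2A320D8A3E280A1D8A3D88C")
shorter = bytes.fromhex("C39EC393C3A320C387C3A1")

# ASSUMPTION: the extra layer came from reading the shorter string's bytes
# as cp1256 and then storing the resulting characters as utf8.
# Undo that one layer: utf8 bytes -> characters -> cp1256 bytes.
peeled = longer.decode("utf-8").encode("cp1256")

# If the guess is right, peeling one layer gives back the shorter string.
same_string = (peeled == shorter)
print(peeled.hex().upper(), same_string)
```

If `same_string` comes out true, the two dumps are the same text with one more layer of misinterpretation on the longer one, which would fit the triple-encoding reading.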
11 | 6 -- yes, it looks like 3 chars, a space (hex 20), then 2 more chars: 11 bytes, 6 chars.
11 | 11 -- BLOB has no distinction between "character" and "byte".
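The byte/character distinction can be reproduced outside MySQL. This is a minimal Python sketch, assuming the C3... hex string from this thread is the column content. Python's `len()` on bytes and on text plays the role of the two counts here; it is an illustration, not the query that produced the numbers above.

```python
# The 11-byte value discussed in this thread.
raw = bytes.fromhex("C39EC393C3A320C387C3A1")

# Interpreted as utf8 text, each C3xx pair becomes one character.
text = raw.decode("utf-8")

byte_count = len(raw)    # what a BLOB sees: bytes only
char_count = len(text)   # what a utf8 text column sees: characters

# List the code points to show they are Western European (U+00xx), not Arabic.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(byte_count, char_count, code_points)
```

All of the non-space code points fall in the Latin-1 range, which matches the observation that C3-prefixed pairs are Western European characters.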
Try this sequence of ALTERs:
C39EC393C3A320C387C3A1 -> tinyblob -> utf8 -> latin1 -> tinyblob -> utf8
(Sorry, my brain is about to explode trying to figure how to fix that string.)
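To make the chain easier to follow, here is a rough Python simulation of each step. It only approximates what the ALTERs would do: Python codecs stand in for MySQL's conversions (Python's latin-1 and MySQL's latin1 agree for the bytes involved here), and the final cp1256 decode is my own assumption, not part of the chain above.

```python
# Column contents, viewed as a blob.
stored = bytes.fromhex("C39EC393C3A320C387C3A1")

# tinyblob -> utf8: same bytes, now interpreted as utf8 characters.
as_utf8 = stored.decode("utf-8")

# utf8 -> latin1: the characters are converted, one byte per character.
as_latin1 = as_utf8.encode("latin-1")

# latin1 -> tinyblob: same bytes, charset label dropped.
blob = bytes(as_latin1)

# tinyblob -> utf8: reinterpret the bytes as utf8 once more.
try:
    final = blob.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    final = None
    valid_utf8 = False

# ASSUMPTION (not from the thread): the remaining bytes might be cp1256.
guess = blob.decode("cp1256")
print(blob.hex().upper(), valid_utf8, guess)
```

In this simulation the six bytes left after the latin1 step are not valid utf8, so the last step of the chain may not work for this particular string. Whether the bytes are really cp1256 would need to be confirmed against the text that was originally entered.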
utf8_general_ci and utf8_unicode_ci -- those are collations, not character sets, so they are not relevant to the encoding issue.