double-encoding of Arabic text
Posted by: Rick James
Date: August 25, 2012 01:18AM
http://bugs.mysql.com/bug.php?id=30277
FULLTEXT is designed for English; I suspect it has problems with other languages and other character sets.
Sorry, the text was double-encoded as it was stored. This will render any collation 'incorrect'. My document somewhat discusses how to fix the data (assuming it is not beyond repair).
SELECT HEX(CONVERT(CONVERT(UNHEX( 'C398C2AF') USING utf8) USING latin1));
--> D8AF, which is the UTF-8 encoding of the Arabic letter DAL (U+062F).
SELECT CONVERT(CONVERT(UNHEX( 'C398C2AF') USING utf8) USING latin1);
--> that character.
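To see where the hex C398C2AF comes from, here is a minimal Python sketch of the same round trip outside MySQL. It assumes the double-encoding happened the usual way: correct UTF-8 bytes were treated as latin1 and converted to UTF-8 a second time. The function names are made up for illustration, and Python's "latin-1" codec is ISO-8859-1, which matches MySQL's latin1 for the bytes D8 and AF but not for every byte, so treat this as a demonstration rather than a repair tool.

```python
# Arabic letter DAL, U+062F. Its correct UTF-8 encoding is D8 AF.
DAL = "\u062f"

def double_encode(text: str) -> bytes:
    """Simulate the damage: UTF-8 bytes misread as latin1, then encoded again."""
    utf8_bytes = text.encode("utf-8")          # D8 AF
    misread = utf8_bytes.decode("latin-1")     # two characters: U+00D8, U+00AF
    return misread.encode("utf-8")             # C3 98 C2 AF

def undo_double_encoding(data: bytes) -> str:
    """Reverse it, mirroring CONVERT(CONVERT(... USING utf8) USING latin1)."""
    once_decoded = data.decode("utf-8")        # back to U+00D8, U+00AF
    original_bytes = once_decoded.encode("latin-1")  # D8 AF
    return original_bytes.decode("utf-8")      # the DAL character

stored = double_encode(DAL)
print(stored.hex().upper())                    # C398C2AF
print(undo_double_encoding(stored) == DAL)     # True
```

If the reversal raises a UnicodeError on real data, that value was probably not double-encoded in this way (or is damaged further), so it is worth checking a sample before touching a whole column.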
UTF-8 encodings for Basic Arabic characters:
http://lcweb2.loc.gov/diglib/codetables/33.html