MySQL Forums

Re: Arabic text in mysql Varchar row
Posted by: Rick James
Date: September 10, 2011 03:07PM

C3993F20C398C2A7C3993FC398C2AEC398C2B720
Yuck! It is worse than just double-encoding.

First, let's look at the first part of that, split into utf8 sequences, then with each sequence converted back to a single byte:
        C399 3F 20 C398 C2A7 C399 3F C398 C2AE C398 C2B7 20
        D9   3F 20 D8   A7   D9   3F D8   AE   D8   B7   20
Notes:  6....3. 4.                3.           5........ 4.
Notes:
1. C399/C398 -- came from D9/D8
2. Arabic characters, in utf8, begin with D8 or D9 (the common letters do, at least)
3. 3F is "?", which is often substituted when a character cannot be converted to the target encoding.
4. 20 is a space -- No problem with this.
5. C398 C2B7 -- D8 B7 -- D8B7 is utf8 for "ARABIC LETTER TAH"; C398C2B7 is the 'double encoding' of that.
6. Some pair of bytes D9xx (I don't know what xx) failed in converting to C399yyyy. Instead of yyyy, you got 3F ('?'). Then coming back, the C399 (U-grave) went to D9, but '?' stayed '?'. That led to an illegal utf8 sequence, D9 3F, hence the diamond with the '?'.
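
The breakdown above can be reproduced outside MySQL. Here is a minimal Python sketch, not anything from the original thread: the helper names `undo_double_encoding` and `double_encode` are made up for illustration, and it assumes the "other" character set was latin1. Python's `latin-1` codec is ISO-8859-1 rather than MySQL's cp1252-flavored latin1, but the two agree on every byte that appears in this dump.

```python
# Hex dump quoted at the top of this post.
DUMP = "C3993F20C398C2A7C3993FC398C2AEC398C2B720"

def undo_double_encoding(hex_str: str) -> bytes:
    """Read the bytes as utf8, then map each character back to one latin1 byte."""
    text = bytes.fromhex(hex_str).decode("utf-8")
    return text.encode("latin-1")

def double_encode(text: str) -> bytes:
    """Simulate the fault: utf8 bytes misread as latin1, then encoded as utf8 again."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")

recovered = undo_double_encoding(DUMP)
print(recovered.hex(" ").upper())  # D9 3F 20 D8 A7 D9 3F D8 AE D8 B7 20

# Note 5: ARABIC LETTER TAH is U+0637.
tah = "\u0637"
print(tah.encode("utf-8").hex().upper())  # D8B7
print(double_encode(tah).hex().upper())   # C398C2B7

# Note 6: D9 followed by 3F is not valid utf8, so a lenient decode
# substitutes U+FFFD (the diamond) and leaves the '?' in place.
print(recovered.decode("utf-8", errors="replace"))
```

The last line prints the diamond followed by '?' wherever a D9 3F pair sits; the byte that originally followed D9 is gone, which is why the data cannot be repaired.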

I'm afraid that the data in your table is corrupted beyond recovery. Start over and re-insert the data. And be sure to check the HEX as soon as you have some Arabic loaded. A cursory check of the hex: C398 and C399 are bad; D8 and D9 (not preceded by C3) are good.
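
That cursory check is easy to script. The following is a rough Python sketch under stated assumptions: the function name `cursory_hex_check` and its return strings are invented for this example, the input is a hex string such as the output of `SELECT HEX(col)`, and the column is expected to hold Arabic text. It is only a rule of thumb, since C398 is also the legitimate utf8 encoding of a Latin letter (O with stroke).

```python
def cursory_hex_check(hex_str: str) -> str:
    """Apply the C398/C399-bad, D8/D9-good rule of thumb to a HEX() dump."""
    data = bytes.fromhex(hex_str)  # works on whole bytes, so no odd-offset matches
    # C3 98 / C3 99 are what D8 / D9 turn into after double encoding.
    if b"\xC3\x98" in data or b"\xC3\x99" in data:
        return "suspect"
    # Plain D8 / D9 lead bytes are what correctly stored Arabic looks like.
    if b"\xD8" in data or b"\xD9" in data:
        return "ok"
    return "no Arabic lead bytes found"

print(cursory_hex_check("D8A7D8AED8B7"))  # ok
print(cursory_hex_check("C3993F20C398C2A7C3993FC398C2AEC398C2B720"))  # suspect
```

Run it on a handful of rows right after the first insert, before loading the rest of the data.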

