MySQL Forums

Re: Can not get proper utf8 text
Posted by: Robert Lee
Date: June 09, 2011 10:28AM

While waiting for a pure MySQL solution, I wrote some C# code to clean up this double-encoding mess. It works on the raw string data returned from the remote database, without calling any MySQL string conversion functions. Hopefully this helps other C# users facing a similar problem.

C# code:
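    // Value as returned by the connector: a .NET string that still holds the double-encoded text.
    // [i] is the column index (assumed; the post as shown indexes only the row).
    string raw = datatbl.Rows[j][i].ToString();
    Debug.Print("raw string (unicode): " + raw);

    // 1. Encode the mis-decoded string as UTF-8 to inspect its bytes.
    byte[] barr = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, Encoding.Unicode.GetBytes(raw));
    Debug.Print(" > utf8 hex value: " + BitConverter.ToString(barr));

    // 2. Re-encode as Windows-1252; this recovers the original UTF-8 byte sequence.
    barr = Encoding.Convert(Encoding.UTF8, Encoding.GetEncoding(1252), barr);
    Debug.Print(" > latin1 hex value: " + BitConverter.ToString(barr));

    // 3. Decode those bytes as UTF-8 to get the intended text.
    string str = Encoding.Unicode.GetString(Encoding.Convert(Encoding.UTF8, Encoding.Unicode, barr));
    Debug.Print(" > unicode string: " + str);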

output:
    raw string (unicode): å†…é–“
     > utf8 hex value: C3-A5-E2-80-A0-E2-80-A6-C3-A9-E2-80-93-E2-80-9C
     > latin1 hex value: E5-86-85-E9-96-93
     > unicode string: 内間
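
The step-by-step conversion above can probably be collapsed into two calls: encode the garbled string as Windows-1252 to get back the original bytes, then decode those bytes as UTF-8. Below is a minimal sketch of that idea, not the code from the post. The class and method names (`MojibakeRepair`, `FixDoubleEncoding`) are made up for illustration. The `RegisterProvider` call is only needed on .NET Core / .NET 5+, where code page 1252 is not registered by default; on .NET Framework it can be dropped.

```csharp
using System;
using System.Text;

static class MojibakeRepair
{
    // Hypothetical helper: undoes one round of "UTF-8 bytes read as Windows-1252".
    public static string FixDoubleEncoding(string mojibake)
    {
        // Makes code page 1252 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Each garbled character maps back to the single byte it was wrongly decoded from.
        byte[] originalBytes = Encoding.GetEncoding(1252).GetBytes(mojibake);

        // Those bytes are the real UTF-8 sequence, so decode them as UTF-8.
        return Encoding.UTF8.GetString(originalBytes);
    }
}
```

Calling `MojibakeRepair.FixDoubleEncoding` on the string whose UTF-8 bytes are `C3-A5-E2-80-A0-E2-80-A6-C3-A9-E2-80-93-E2-80-9C` yields 内間. Only apply it to values known to be double-encoded: a string that is already correct and contains characters outside Windows-1252 would be damaged.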

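Note: When converting UTF-8 to latin-1, ISO-8859-1 won't work; you must use Windows code page 1252. The garbled text contains characters such as † and “ that come from bytes in the 0x80-0x9F range, which only code page 1252 maps to printable characters.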
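
To see the difference between the two code pages, one option is to ask each encoding whether it can represent the garbled string at all. This is a rough sketch under assumptions: `CodePageCheck` and `CanEncode` are illustrative names, 28591 is the .NET code page number for ISO-8859-1, and exception fallbacks are used so that unmappable characters raise an error instead of being silently replaced.

```csharp
using System;
using System.Text;

static class CodePageCheck
{
    // Hypothetical helper: true if every character of `text` has a byte in the given code page.
    public static bool CanEncode(int codePage, string text)
    {
        // Makes code page 1252 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Exception fallbacks turn unmappable characters into errors.
        Encoding strict = Encoding.GetEncoding(codePage,
            EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }
}
```

With the garbled sample value, `CanEncode(1252, ...)` succeeds while `CanEncode(28591, ...)` fails, because † (U+2020) and the other punctuation characters have no byte in ISO-8859-1.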


Bob



Edited 1 time(s). Last edit at 06/09/2011 11:09AM by Robert Lee.
