Re: Byte level storage seems right, display seems wrong
Posted by: Rick James
Date: February 26, 2016 01:36PM

0xC383C2A1 is "double encoding".
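You can check that hex for yourself. Here is a minimal Python sketch (not MySQL code) that takes á, encodes it as utf8, misreads those bytes as latin1, and encodes to utf8 again. It assumes Python's `latin-1` codec is a fair stand-in for MySQL's latin1; the two agree for the bytes involved here.

```python
# Sketch: reproduce the double-encoded bytes for a-acute (U+00E1).
original = "\u00e1"

utf8_bytes = original.encode("utf-8")       # what the client really holds
misread = utf8_bytes.decode("latin-1")      # wrongly treated as two latin1 characters
double_encoded = misread.encode("utf-8")    # converted "to utf8" a second time

print(utf8_bytes.hex().upper())      # C3A1
print(double_encoded.hex().upper())  # C383C2A1
```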

Here's what probably happened.

- The client had characters encoded as utf8 (good); and
- `SET NAMES latin1` lied by claiming that the client had latin1 encoding; and
- The column in the table was declared `CHARACTER SET utf8` (good).

That is, the data was utf8, yet you continued to claim it was latin1 by having character_set_connection and character_set_client both saying latin1.

character_set_results is the 3rd component of `SET NAMES`; it was probably the same as those other two. Because of this, C383C2A1 was "decoded" twice, thereby getting á as expected.
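The read path can be sketched the same way. This is only an illustration in Python, assuming character_set_results was latin1 and that the client then interprets the bytes it receives as utf8; the variable names are made up for the example.

```python
# Sketch: why the double-encoded value still displays correctly on SELECT.
stored = bytes.fromhex("C383C2A1")    # bytes sitting in the utf8 column

as_text = stored.decode("utf-8")      # server reads the column as utf8: two characters
sent = as_text.encode("latin-1")      # converted to latin1 for character_set_results
shown = sent.decode("utf-8")          # client really treats the bytes as utf8

print(sent.hex().upper())   # C3A1
print(shown == "\u00e1")    # True
```

So the same wrong setting that damaged the data on the way in also undoes the damage on the way out, which is why the display looked fine.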

Keep in mind that the connection settings (SET NAMES) must agree with the encoding of the bytes in the client. This was (correctly) latin1 in your first case, but incorrectly latin1 in the second case.
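To make the rule concrete, here is a rough Python model of what gets stored. `stored_bytes` is a hypothetical helper, not anything in MySQL, and I am assuming your first case means the client bytes really were latin1.

```python
# Sketch: model the conversion from the declared client charset to the column charset.
def stored_bytes(client_bytes: bytes, declared: str, column: str = "utf-8") -> bytes:
    """Decode using the charset the connection claims, then encode for the column."""
    return client_bytes.decode(declared).encode(column)

a_acute = "\u00e1"

# Client has latin1 bytes and SET NAMES latin1: the claim is true.
case1 = stored_bytes(a_acute.encode("latin-1"), declared="latin-1")

# Client has utf8 bytes but SET NAMES latin1: the claim is false.
case2 = stored_bytes(a_acute.encode("utf-8"), declared="latin-1")

# Client has utf8 bytes and SET NAMES utf8: the claim is true again.
fixed = stored_bytes(a_acute.encode("utf-8"), declared="utf-8")

print(case1.hex().upper())  # C3A1
print(case2.hex().upper())  # C383C2A1
print(fixed.hex().upper())  # C3A1
```

Only the case where the declaration disagrees with the actual bytes ends up double encoded.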

http://mysql.rjweb.org/doc.php/charcoll#example_of_double_encoding (and elsewhere in that blog) discusses double encoding some more.
