Re: Byte level storage seems right, display seems wrong
Posted by: Andy Theuninck
Date: February 26, 2016 02:56PM

character_set_connection, character_set_client, and character_set_results were all latin1 in all queries.

If the client sends the correct latin1 encoding, 0XE1, MySQL saves that value to the table with the correct unicode encoding 0XC3A1. This makes perfect sense. The client told the server what character set it was using, did in fact use that character set, and the server was able to translate it to the table's encoding.

When I query this correctly saved value using the mysql cli client, still with all the character_set_* settings above set to latin1, the display that comes back is a gray square instead of an á.

On the other hand, if the client sends the utf8 encoding, 0XC3A1, MySQL saves that value as 0XC383C2A1. This also makes perfect sense. The client lied about what encoding it was sending so the server ended up with the double-encoded value.

When I query this incorrectly/double-encoded value using the mysql cli client, still with all the character_set_* settings above set to latin1, the display that comes back is an á.

In other words, if character data is submitted and stored correctly and a user tries to view that data with the cli client the non-ASCII characters look broken. If the character data is submitted and stored incorrect and a user tries to view that data with the cli client the non-ASCII characters look correct. This is difficult to explain to people who don't want to talk about hex codes for character encodings - if it looks right on screen that means it's actually wrong.

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Byte level storage seems right, display seems wrong
1025
February 26, 2016 02:56PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.