Fixing Double Encoding With mysqldump
Posted by: Peter Berry
Date: October 20, 2011 04:59PM

Hi, I've just diagnosed my problem as "Double Encoding" as described by Rick James' incredibly helpful page at http://mysql.rjweb.org/doc.php/charcoll.

Indeed, I have UTF8 data going in, a connection (erroneously) set to Latin1, and a database and tables set to UTF8. Each 2-byte UTF8 character going in ends up stored as 4 bytes in my tables.
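
To make the symptom concrete, here is a minimal sketch of how one 2-byte character becomes 4 bytes. It is an illustration only: Python's "latin-1" codec stands in for MySQL's latin1 (the two are not identical character sets, but they agree for the bytes used here), and the sample character is an assumption, not data from my tables.

```python
# Illustration only: Python's "latin-1" codec stands in for MySQL's latin1.
original = "é"                      # one character (sample value, hypothetical)

sent = original.encode("utf-8")     # the client sends 2 bytes: b'\xc3\xa9'
misread = sent.decode("latin-1")    # a latin1 connection sees 2 characters: 'Ã©'
stored = misread.encode("utf-8")    # converted to utf8 for the column: 4 bytes

print(len(sent), len(stored))       # 2 4
print(stored)                       # b'\xc3\x83\xc2\xa9'
```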

My question is about fixing it. It seems like by far the easiest thing to do would be:

1. mysqldump --default-character-set=latin1 .... my_database > my_database_latin1.sql
2. Edit my_database_latin1.sql and change the SET NAMES statement at the top from latin1 to utf8.
3. mysql ... < my_database_latin1.sql
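
Step 2 could be scripted rather than done in an editor. The sketch below assumes the dump header contains a versioned comment of the form `/*!40101 SET NAMES latin1 */;` (the version number may differ between mysqldump releases, so the pattern accepts any digits). The function name `relabel_dump` is hypothetical. It works on raw bytes so that none of the data in the file is re-encoded, and it changes only the first match so that the text "latin1" inside row data is left alone.

```python
import re

# Assumed header form: b"/*!40101 SET NAMES latin1 */;" (version digits may vary).
SET_NAMES = re.compile(rb"(/\*!\d+ SET NAMES )latin1( \*/;)")

def relabel_dump(data: bytes) -> bytes:
    """Relabel the first SET NAMES latin1 statement as utf8.

    Every other byte in the dump is returned unchanged.
    """
    return SET_NAMES.sub(rb"\g<1>utf8\g<2>", data, count=1)

# Small in-memory sample standing in for the top of a dump file.
sample = (
    b"-- MySQL dump\n"
    b"/*!40101 SET NAMES latin1 */;\n"
    b"INSERT INTO t VALUES ('latin1');\n"
)
print(relabel_dump(sample).decode("ascii"))
```

For a real dump the bytes would be read from and written back to my_database_latin1.sql in binary mode. Loading the whole file into memory is only reasonable for small dumps.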

The dump should convert each 4-byte sequence, which the server thinks is 2 utf8 characters, back down to 2 bytes (2 latin1 characters). The file then actually contains UTF8, even though it is labeled latin1. So I change the connection encoding set at the top of the file to UTF8 and import the data back into the same database.
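
The reasoning above can be checked on a single value. This is a sketch under the same assumption as before: Python's "latin-1" codec approximates MySQL's latin1, and the 4 stored bytes are a hypothetical double-encoded "é".

```python
# A double-encoded 'é' as it would sit in the utf8 column (hypothetical sample).
stored = b"\xc3\x83\xc2\xa9"

# Dumping over a latin1 connection: utf8 -> latin1 shrinks 4 bytes to 2.
dumped = stored.decode("utf-8").encode("latin-1")

# Reloading with the file relabeled as utf8: the same 2 bytes are read as utf8.
restored = dumped.decode("utf-8")

print(len(stored), len(dumped))   # 4 2
print(restored)                   # é
```

One thing worth testing before relying on this: the round trip only works for values that really are double-encoded. A row that was stored correctly and holds a character with no latin1 equivalent cannot be written to a latin1 dump without loss.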

Since Rick doesn't quite mention this fix, I'm assuming I'm missing something. A quick test does look ok so far.

Am I missing something?

Thanks for the help - Peter



