Migration encoding issues: same bytes, different text
Hi,
I apologise if this question is all over the place, because I cannot make sense of this myself.
I was running a 4.1 server with tables in "latin1" encoding. I used the admin GUI (5.0) to pull a backup and transfer it to a 5.1 server, using the default settings for the transfer. It seemed to work. The 5.1 tables are in the Windows 1252 encoding.
After using the new database for a while, we noticed that one character (double backquote) was not displaying correctly wherever it occurred. I could do a search-and-replace, but I worry that there are other problems in the other databases that I don't yet know about.
What I would like to do is somehow repair all of these encoding problems, in case there are other, rarer characters we haven't seen yet. Here's the problem:
If I use mysqldump --default-character-set="latin1" from the 4.1 server, the data comes out fine. I wish I'd done this in the first place, but it's too late now.
I can't just dump the data from the 5.1 server, load it into the 4.1 server and dump it again as above; it doesn't work. Here's the thing: I have two dump files. One is from the 4.1 server, made with "mysqldump --default-character-set=latin1 ..." (happy.sql), and one is from the 5.1 server, made with "mysqldump ..." (sad.sql).
I can "mysql <..." both of them into the 4.1 server and look at a certain string containing this character. At the byte level, the strings are exactly the same no matter which file they came from. The table and column encodings are the same. When I re-dump using "--default-character-set=latin1", the dump that came from "happy.sql" has the character in there just as it should be; I can open it in a UTF-8 capable editor and it looks fine. But if I load and re-dump "sad.sql", the character is replaced by a jumble of other characters.
So, same bytes, same encoding, different text when dumped.
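As an illustration of how one character can turn into a jumble of others, here is a minimal Python sketch. It is only a guess at the mechanism, not a confirmed diagnosis of these servers: it assumes the problem character is the left curly double quote (U+201C), and that somewhere along the way UTF-8 bytes were read as if they were Windows-1252.

```python
# Assumption: the affected character is the left curly double quote (U+201C).
QUOTE = "\u201c"

# The same character is a different byte sequence in each encoding.
as_cp1252 = QUOTE.encode("cp1252")  # a single byte, 0x93
as_utf8 = QUOTE.encode("utf-8")     # three bytes, 0xE2 0x80 0x9C

# If the UTF-8 bytes are read as though they were cp1252, every byte
# becomes a separate character: U+00E2, U+20AC, U+0153.
misread = as_utf8.decode("cp1252")

# Writing the misread text back out as UTF-8 bakes the damage into the file.
double_encoded = misread.encode("utf-8")

# Hex output keeps the demonstration independent of the terminal's charset.
print(as_cp1252.hex(), as_utf8.hex(), double_encoded.hex())
```

The point of the sketch is that the text a tool shows depends on which encoding it believes the bytes are in, so identical stored bytes can still come out differently if the two load-and-dump paths declare different character sets.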
Is there an extra layer of en/de-coding somewhere that I can control to re-encode the data sitting in the 5.1 database? Better yet, is there a way to repair this somehow in place?
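For reference, the kind of repair being asked about can be sketched outside the database in Python. This assumes the damage is exactly one layer of UTF-8 bytes misread as Windows-1252, which is unverified here; the function name `undo_one_layer` is made up for this example and is not part of any MySQL tool.

```python
def undo_one_layer(text: str) -> str:
    """Try to reverse one layer of UTF-8-read-as-cp1252 damage.

    Returns the repaired text if the round trip is valid, otherwise
    returns the input unchanged so that healthy strings are not touched.
    """
    try:
        # Recover the original bytes, then decode them as the UTF-8 they were.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in cp1252, or not valid UTF-8 afterwards:
        # this string does not fit the assumed damage pattern.
        return text


damaged = "\u00e2\u20ac\u0153quoted"   # the jumble followed by normal text
print(undo_one_layer(damaged) == "\u201cquoted")
print(undo_one_layer("plain ascii") == "plain ascii")
```

Leaving a string alone when the round trip fails is a deliberate choice: it keeps the function safe to run over a whole column or dump, where most values are presumably fine and only some contain the damaged character.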
Any advice whatsoever would be appreciated.
Thanks,
Jason