MySQL Forums :: Backup :: mysqldump file/data encoding craziness
mysqldump file/data encoding craziness
Posted by: Matthew Lenz
Date: May 23, 2012 09:40AM
We've got a product that is a multi-member + member user web app. We provide a nightly snapshot for a specific member that includes only their own data in the form of a mysqldump. This is ultimately never really used except when they want to test their own internal disaster recovery. When prepping for the most recent DR test I noticed that the application was behaving strangely and queries were taking FOREVER to return. They are complex and large queries but use indexes and are lightning fast normally. The app in their DR environment was behaving almost like no indexes were created.
We recently upgraded our own servers and this customer's DR server to debian 6.0 and as a result mysql 5.0 to mysql 5.1 (these are the stable packages supported directly by debian). I believe debian recently moved to using utf-8 as the default file character set. All our tables are myisam with latin1 as the default character set. In fact all the default mysql variables I could find for the various commands (mysql, mysqladmin, mysqldump) are latin1 except for the default system character set which is utf8.
One difference I noticed using the 'file' command is that this DR dump is UTF-8 with long lines and our normal backups are ASCII with long lines. I thought this might be the issue and attempted using the default-character-set parameter to force various character sets (ascii, latin1, etc). I also tried the set-names parameter with the same settings. This didn't resolve the issue with the queries on the DR server. We also have another app based on the same code and it's dumps always result in a 'file' command of UTF-8. It imports and works just fine. So that is out.
The solution I found that works regardless of what the 'file' command encoding result is that if I 'grep' out all the /* ###### */ import tokens from the dump it imports and the database works perfectly. Then I thought well maybe it's not the actual removing of the import tokens. Maybe it's grep processing the data in some other way. So I instead of grepping out the import tokens. I just grepped out the '--' uneeded comments from the dump. Nope. Still broken.
I think the most annoying part is that the main applicatin database that I'm actually pulling this specific member data from dumps just fine. For this DR site I create a new DB (on the same server) and dump/load ALL the table creates followed by dump/load of the data from SPECIFIC tables (skipping the uneeded log data in some of the tables all together). Then I proceed to just remove the member data I'm not interested in and then dump the entire resulting database. If i don't "grep process" this dump file. It the queries are broken.
Does anyone have ANY clue what is going on here? Sorry this is so long winded but there has to be some one out there that knows what exactly is going on here.
I thought maybe I'd add an example of the tokens that are added to my dumps. These same tokens are what I'm having to strip to get everything to work. Hopefully it doesn't get mess up to badly by the auto formatting. I just showed the header, first table and footer of the dump.
-- MySQL dump 10.13 Distrib 5.1.61, for debian-linux-gnu (x86_64)
-- Host: dbprod Database: penske_backup
-- Server version 5.1.61-0+squeeze1
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;
-- Table structure for table `XXXXXXXXXXXX`
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `XXXXXXXXXXXX` (
) ENGINE=MyISAM AUTO_INCREMENT=6540 DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
-- Dumping data for table `XXXXXXXXXXXX`
LOCK TABLES `XXXXXXXXXXXX` WRITE;
/*!40000 ALTER TABLE `XXXXXXXXXXXX` DISABLE KEYS */;
/*!40000 ALTER TABLE `XXXXXXXXXXXX` ENABLE KEYS */;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;
/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;
-- Dump completed on 2012-05-23 10:17:25
Edited 1 time(s). Last edit at 05/23/2012 12:15PM by Matthew Lenz.
Sorry, you can't reply to this topic. It has been closed.
Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.