Skip navigation links

MySQL Forums :: Backup :: mysqldump file/data encoding craziness


Advanced Search

mysqldump file/data encoding craziness
Posted by: Matthew Lenz ()
Date: May 23, 2012 09:40AM

We've got a product that is a multi-member + member user web app. We provide a nightly snapshot for a specific member that includes only their own data in the form of a mysqldump. This is ultimately never really used except when they want to test their own internal disaster recovery. When prepping for the most recent DR test I noticed that the application was behaving strangely and queries were taking FOREVER to return. They are complex and large queries but use indexes and are lightning fast normally. The app in their DR environment was behaving almost like no indexes were created.

We recently upgraded our own servers and this customer's DR server to debian 6.0 and as a result mysql 5.0 to mysql 5.1 (these are the stable packages supported directly by debian). I believe debian recently moved to using utf-8 as the default file character set. All our tables are myisam with latin1 as the default character set. In fact all the default mysql variables I could find for the various commands (mysql, mysqladmin, mysqldump) are latin1 except for the default system character set which is utf8.

One difference I noticed using the 'file' command is that this DR dump is UTF-8 with long lines and our normal backups are ASCII with long lines. I thought this might be the issue and attempted using the default-character-set parameter to force various character sets (ascii, latin1, etc). I also tried the set-names parameter with the same settings. This didn't resolve the issue with the queries on the DR server. We also have another app based on the same code and it's dumps always result in a 'file' command of UTF-8. It imports and works just fine. So that is out.

The solution I found that works regardless of what the 'file' command encoding result is that if I 'grep' out all the /* ###### */ import tokens from the dump it imports and the database works perfectly. Then I thought well maybe it's not the actual removing of the import tokens. Maybe it's grep processing the data in some other way. So I instead of grepping out the import tokens. I just grepped out the '--' uneeded comments from the dump. Nope. Still broken.

I think the most annoying part is that the main applicatin database that I'm actually pulling this specific member data from dumps just fine. For this DR site I create a new DB (on the same server) and dump/load ALL the table creates followed by dump/load of the data from SPECIFIC tables (skipping the uneeded log data in some of the tables all together). Then I proceed to just remove the member data I'm not interested in and then dump the entire resulting database. If i don't "grep process" this dump file. It the queries are broken.

Does anyone have ANY clue what is going on here? Sorry this is so long winded but there has to be some one out there that knows what exactly is going on here.

EDIT

I thought maybe I'd add an example of the tokens that are added to my dumps. These same tokens are what I'm having to strip to get everything to work. Hopefully it doesn't get mess up to badly by the auto formatting. I just showed the header, first table and footer of the dump.


-- MySQL dump 10.13 Distrib 5.1.61, for debian-linux-gnu (x86_64)
--
-- Host: dbprod Database: penske_backup
-- ------------------------------------------------------
-- Server version 5.1.61-0+squeeze1

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `XXXXXXXXXXXX`
--

/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `XXXXXXXXXXXX` (
) ENGINE=MyISAM AUTO_INCREMENT=6540 DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `XXXXXXXXXXXX`
--

LOCK TABLES `XXXXXXXXXXXX` WRITE;
/*!40000 ALTER TABLE `XXXXXXXXXXXX` DISABLE KEYS */;
/*!40000 ALTER TABLE `XXXXXXXXXXXX` ENABLE KEYS */;
UNLOCK TABLES;

.
.
.
.
.
.

/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2012-05-23 10:17:25



Edited 1 time(s). Last edit at 05/23/2012 12:15PM by Matthew Lenz.

Options: ReplyQuote


Subject Views Written By Posted
mysqldump file/data encoding craziness 2265 Matthew Lenz 05/23/2012 09:40AM
Re: mysqldump file/data encoding craziness 1044 Rick James 05/26/2012 11:09PM
Re: mysqldump file/data encoding craziness 2131 Matthew Lenz 05/27/2012 01:44PM
Re: mysqldump file/data encoding craziness 1046 Rick James 05/28/2012 09:52AM
Re: mysqldump file/data encoding craziness 1001 Matthew Lenz 05/29/2012 10:50AM
Re: mysqldump file/data encoding craziness 842 Rick James 05/30/2012 07:51PM
Re: mysqldump file/data encoding craziness 866 Matthew Lenz 05/31/2012 07:27AM
Re: mysqldump file/data encoding craziness 974 Rick James 06/12/2012 10:47PM
Re: mysqldump file/data encoding craziness 791 Matt Berlin 07/03/2012 02:46PM
Re: mysqldump file/data encoding craziness 928 Rick James 07/04/2012 04:37PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.