MySQL Forums
Forum List  »  InnoDB

Re: Mass Import Innodb
Posted by: Basilis Papadopoulos
Date: January 31, 2012 10:44AM

> innodb_flush_log_at_trx_commit=2
I use value 2 to reduce i/o.

> That is, >99% of reads were satisfied without hitting the disk.
95% according to innotop

>With a battery-back-write-cache? If so, then writes are virtually free.
No unfortunately

>MySQL will not use much more than 1GB.
It is not a dedicated mysql server. I run more services on it. Are there any settings on conf that could cause problems?

>Fastest loading methods...
I'm aware that there are several times faster but I need to change the IDs of the records on the fly. As a result I can't use LOAD DATA FILE or INSERT INTO ... VALUES.


By using the SQL that I posted, I copy the data from the oldDB.documentlist to the newDB.documentlist and I change the value of the id_document. The new values of the ID are stored on temp_docids table. The reordering must take place, since it improves dramatically the speed of the selects on the new DB (the documents are sorted by popularity).

Here are the 3 tables. You will notice that documentlist table schema is the same in both DBs.

CREATE TABLE `oldDB`.`documentlist` (
`id_document` int(10) unsigned NOT NULL,
`name` varchar(255) DEFAULT NULL,
`longHTMLdescr` blob,
`fileData` mediumblob,
`lastupdated` datetime DEFAULT NULL,
`status` tinyint(3) unsigned DEFAULT NULL,
`code` smallint(5) unsigned DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;


CREATE TABLE `newDB`.`documentlist` (
`id_document` int(10) unsigned NOT NULL,
`name` varchar(255) DEFAULT NULL,
`longHTMLdescr` blob,
`fileData` mediumblob,
`lastupdated` datetime DEFAULT NULL,
`status` tinyint(3) unsigned DEFAULT NULL,
`code` smallint(5) unsigned DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;


CREATE TABLE `newDB`.`temp_docids` (
`id_document` int(10) unsigned NOT NULL AUTO_INCREMENT,
`id_document_old` int(10) unsigned NOT NULL,
PRIMARY KEY (`id_document`) USING BTREE,
KEY `Index_id_document_old` (`id_document_old`,`id_document`),
KEY `Index_id_document_old2` (`id_document`,`id_document_old`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;


The temp_docsids stores the mapping between the new and the old ID. That's the schema of the latest version that I use. I added also 2 indexes just for testing but it does not prefer to use any of them. Forcing them has the same effect.


I have ensured the following: the Hard Disk is not used at all (10-15MB/s reads&writes), CPU utilization is very low and even when memory is available to the sytem MySQL does not use it (probably due to the innodb_buffer_pool_size Value).

Finally as I found out the mapping between old and new IDs looks like this:
new Old
1 => 5012
2 => 5014
3 => 100000
4 => 12500000
5 => 12500001
6 => 3
...

In a worst case scenario this could lead to reading 1 disk page per table row, which is slow. Nevertheless even in this case, I am able to copy at the same time huge files to the hard disk extremely fast (200-500MB/s). This shows that the problem is not on the HD but rather on MySQL configuration/query/or similar.


Any clues at all?

Options: ReplyQuote


Subject
Views
Written By
Posted
3466
January 29, 2012 07:10AM
1052
January 30, 2012 09:07AM
1048
January 30, 2012 10:24AM
1010
January 31, 2012 07:06AM
964
January 31, 2012 02:51AM
888
January 31, 2012 03:21AM
1029
January 31, 2012 10:20AM
Re: Mass Import Innodb
1257
January 31, 2012 10:44AM
963
February 01, 2012 02:22AM
1056
February 01, 2012 04:33AM
999
February 06, 2012 05:59AM
1139
February 07, 2012 10:38AM
1004
February 07, 2012 11:15AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.