MySQL Forums
Forum List  »  InnoDB

Re: random-access I/O
Posted by: Basilis Papadopoulos
Date: February 01, 2012 05:19AM

>Do you have blobs? Can you compress them? That will save a significant amount of I/O (and space).
I have 2 blobs in the table, one medium and one small. I use compression and indeed it REALLY helps I/O and space. Unfortunately though I have read several articles that mention that by compressing the tables you increase the insertion time.


>It is unclear whether you should sort before inserting or insert randomly. Need to see the details.
I sort before inserting because this heavily affects the speed and reduces I/O when the table gets bigger. In the past I used to insert randomly and after a while the speed of INSERTs decreased a lot. Also the empty space of the innodb table files heavily increased! That's why now I insert the records sorted by the new ID. As you'll see below this leads to "random" reads from the old table, which affects the speed.


Here is the table schema:

CREATE TABLE `documentlist` ( 
`id_document` int(10) unsigned NOT NULL, 
`name` varchar(255) DEFAULT NULL, 
`longHTMLdescr` blob, 
`fileData` mediumblob, 
`lastupdated` datetime DEFAULT NULL, 
`status` tinyint(3) unsigned DEFAULT NULL, 
`code` smallint(5) unsigned DEFAULT NULL, 
PRIMARY KEY (`id`) 
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;


I was about to provide you the INSERT..SELECT statement but after testing I have found that a particular segment of the SELECT part of the statement is the slow one. Here is a simplified version of the statement that is executed slowly (40 seconds). The whole INSERT...SELECT is executed in 41 secs so that's the problematic part:

SELECT `id_document`,`name` 
FROM `oldDB`.`documentlist` 
WHERE `id_document` in (15371963,14864105,13406499,13230248,16477007,10207882,14408890,.......);

inside the in() there are 1000 IDs. I am forced to select them with this particular order in order to do the inserts on the new table in a sorted manner. Selecting them in a different way is not an option (I've heavily tested this). I am sure that the above records are stored in different disk pages and this leads to slow execution. Still though HD is not used at all and it is able to perform reads at 500MB/s for other tasks at the same time. That's why I believe that it is a problem of the memory or buffers. Also CPU usage is very low.

Here is the explain of the above query. As you see the PRIMARY key is used normally:
1, 'SIMPLE', 'documentlist', 'range', 'PRIMARY', 'PRIMARY', '4', '', 1000, 'Using where'

Here are the buffers:

'bulk_insert_buffer_size', '4096'
'innodb_buffer_pool_instances', '1'
'innodb_buffer_pool_size', '1074790400'
'innodb_change_buffering', 'all'
'innodb_log_buffer_size', '10485760'
'join_buffer_size', '1048576'
'key_buffer_size', '134217728'
'myisam_sort_buffer_size', '1048576'
'net_buffer_length', '65536'
'preload_buffer_size', '32768'
'read_buffer_size', '2097152'
'read_rnd_buffer_size', '2097152'
'sort_buffer_size', '1048576'
'sql_buffer_result', 'OFF'

I really don't understand what's wrong. If it is not a problem on the buffers (most likely), then it's probably a bug?

Options: ReplyQuote


Subject
Views
Written By
Posted
3579
January 31, 2012 03:50AM
1082
January 31, 2012 08:25PM
Re: random-access I/O
1021
February 01, 2012 05:19AM
986
February 03, 2012 01:41AM
1243
February 03, 2012 02:39AM
1069
February 03, 2012 11:08PM
1014
February 04, 2012 08:54AM
1536
February 05, 2012 12:56PM
1030
February 08, 2012 07:22AM
1493
February 08, 2012 11:17PM
927
February 09, 2012 05:30AM
1016
February 09, 2012 01:48AM
846
February 09, 2012 01:58AM
1060
February 09, 2012 10:52AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.