MySQL Forums
Forum List  »  InnoDB

Performance Problems for large inserts into INNODB table via LOAD DATA
Posted by: Michael Vitale
Date: December 27, 2010 09:59AM

subject: Performance Problems for large inserts into INNODB table via LOAD DATA
I am trying to insert about 300 million records into one INNODB table via LOAD DATA calls, where each file may have 1-30 records in it. I am trying to speed up performance and here is where I'm at: My inserts per second take longer and longer after the insert job has been running for awhile. It decreases from about 2000/sec to about 400/sec.

I am looking for optimization recommendations via my my.cnf file if it becomes apparent to anybody.

Operating system: Unix Centos 5.5
CPUs: 8
Memory: 64G
MySQL version: 5.1.48, using latest INNODB plugin.
Dedicated MySQL Server box.

Program Insert Description:
Using a Python/MySQLdb script that makes multiple LOAD DATA calls to insert records into 2 tables. The program is run from the MySQL server and the import files are located there as well. This is a dedicated MySQL server just for this sole process, so no concurrent activity. The program does random commits to force flushing of buffers.

DDL description:
The tables are defined only with primary keys, and the insert order is based on an ascending primary key value, so we are inserting the records in primary key order. There are no other indexes defined on the table. There is one table per tablespace.

Insert case 1:
When I ran this with one large datafile in one tablespace with no partitioning, I was getting very good inserts per second (2000), but it started to hang after about 5 million rows were inserted. Also, I noticed that commands like optimize took too long to run, so I changed the DDL as noted in the next case.

Insert case 2:
Created 10,35G datafiles in one tablespace (the biggest table), and partitioned the table based on the primary key.
My Inserts per second was about the same as before but after a while the inserts per second started to take longer.

Here is my my.cnf settings:
[mysqld]
datadir=/vol0/opt/mysql
socket=/var/lib/mysql/mysql.sock
innodb_data_file_path=ibdata1:1000M;ibdata2:100M:autoextend
########### use 38G for bufferpool
innodb_buffer_pool_size=40802189312
innodb_additional_mem_pool_size=500M

# new log file size of 500M
innodb_log_file_size=524288000
innodb_log_files_in_group=4

#20M log buffer size
innodb_log_buffer_size=20971520

innodb_flush_method=O_DSYNC
innodb_flush_log_at_trx_commit=1

innodb_thread_concurrency=16
interactive_timeout=28800
net_write_timeout=60
wait_timeout=28800
connect_timeout=60
net_read_timeout=60

default-storage-engine=InnoDB
innodb_file_per_table=1

# enabled new INNODB plug in:
ignore_builtin_innodb
plugin-load=innodb=ha_innodb_plugin.so;innodb_trx=ha_innodb_plugin.so;innodb_locks=ha_innodb_plugin.so;innodb_lock_waits=ha_innodb_plugin.so;innodb_cmp=ha_innodb_plugin.so;innodb_cmp_reset=ha_innodb_plugin.so;innodb_cmpmem=ha_innodb_plugin.so;innodb_cmpmem_reset=ha_innodb_plugin.so
innodb_file_format=barracuda
innodb_strict_mode=1
innodb_use_sys_malloc=1

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid


Here is my database DDL:
CREATE DATABASE discoverydb;
CREATE TABLESPACE TSDiscovery ADD DATAFILE 'TSDiscovery01' INITIAL_SIZE = 5368709120 EXTENT_SIZE = 134217728 ENGINE=INNODB;

CREATE TABLESPACE TSTickets ADD DATAFILE 'TSTickets01' INITIAL_SIZE = 34359738368 EXTENT_SIZE = 2147483648 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets02' INITIAL_SIZE = 34359738368 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets03' INITIAL_SIZE = 34359738368 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets04' INITIAL_SIZE = 34359738368 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets05' INITIAL_SIZE = 34359738368 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets06' INITIAL_SIZE = 34359738368 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets07' INITIAL_SIZE = 34359738368 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets08' INITIAL_SIZE = 34359738368 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets09' INITIAL_SIZE = 34359738368 ENGINE=INNODB;
ALTER TABLESPACE TSTickets ADD DATAFILE 'TSTickets10' INITIAL_SIZE = 34359738368 ENGINE=INNODB;

CREATE TABLESPACE TSTicketAtt ADD DATAFILE 'TSTicketAtt01' INITIAL_SIZE = 34359738368 EXTENT_SIZE = 2147483648 ENGINE=INNODB;
ALTER TABLESPACE TSTicketAtt ADD DATAFILE 'TSTicketAtt02' INITIAL_SIZE = 34359738368 ENGINE=INNODB;

CREATE TABLE discoverydb.TicketMetrics(
id INT NOT NULL AUTO_INCREMENT,
GENStartYear INT NULL,
GENEndYear INT NULL,
GENLastRunStart DATETIME NULL,
GENLastRunEnd DATETIME NULL,
GENTotalFilesCreated INT NULL,
GENTktFilesCreated INT NULL,
GENAttFilesCreated INT NULL,
GENMessagesCreated INT NULL,
GENAttachmentsCreated INT NULL,
GENMBytesUsed INT NULL,
GENRunMins INT NULL,
GENErrors VARCHAR(10000) NULL,
GENRunInfo VARCHAR(10000) NULL,
IMPStartYear INT NULL,
IMPEndYear INT NULL,
IMPLastRunStart DATETIME NULL,
IMPLastRunEnd DATETIME NULL,
IMPTotalFilesProcessed INT NULL,
IMPTktFilesProcessed INT NULL,
IMPAttFilesProcessed INT NULL,
IMPRowsInserted INT NULL,
IMPMBytesUsed INT NULL,
IMPRunMins INT NULL,
IMPRunInfo VARCHAR(10000) NULL,
IMPWarnings INT NULL,
IMPErrors INT NULL,
IMPErrorInfo VARCHAR(10000) NULL,
PRIMARY KEY (id)) TABLESPACE TSDiscovery Engine=InnoDB;

CREATE TABLE discoverydb.Tickets(
TKT_ID INT NOT NULL,
aYearMonthDayDir INT NOT NULL,
RUN_ID INT NOT NULL,
LastUpdate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
XMTSTimestamp BIGINT NULL,
FileDateTime DATETIME NOT NULL,
Attachments TINYINT NOT NULL,
MessageID VARCHAR(500) NULL,
XMTSTicket VARCHAR(500) NOT NULL,
Maildate VARCHAR(60) NULL,
Mailsubject VARCHAR(500) NULL,
Mailfrom VARCHAR(500) NULL,
MailFrom2 VARCHAR(500) Null,
MailTo VARCHAR(500) NULL,
Mailcc VARCHAR(500) NULL,
Mailreplyto VARCHAR(500) NULL,
Mailorg VARCHAR(500) NULL,
BodyType VARCHAR(100) NOT NULL,
BodyLength INT NOT NULL,
MailExportNotes VARCHAR(5000) NULL,
MailImportNotes VARCHAR(5000) NULL,
Mailbody MEDIUMTEXT NOT NULL,
PRIMARY KEY (TKT_ID,aYearMonthDayDir)) TABLESPACE discovery Engine=InnoDB CHARACTER SET utf8 COLLATE utf8_general_ci
PARTITION BY RANGE (aYearMonthDayDir)
(
PARTITION P1989_1995 VALUES LESS THAN (19960000),
PARTITION P1996 VALUES LESS THAN (19970000),
PARTITION P1997 VALUES LESS THAN (19980000),
PARTITION P1998 VALUES LESS THAN (19990000),
PARTITION P1999 VALUES LESS THAN (20000000),
PARTITION P2000 VALUES LESS THAN (20010000),
PARTITION P2001 VALUES LESS THAN (20020000),
PARTITION P2002 VALUES LESS THAN (20030000),
PARTITION P2003 VALUES LESS THAN (20040000),
PARTITION P2004 VALUES LESS THAN (20050000),
PARTITION P2005 VALUES LESS THAN (20060000),
PARTITION P2006 VALUES LESS THAN (20070000),
PARTITION P2007 VALUES LESS THAN (20080000),
PARTITION P2008 VALUES LESS THAN (20090000),
PARTITION P2009 VALUES LESS THAN (20100000),
PARTITION P2010UP VALUES LESS THAN (20200000)
);

CREATE TABLE discoverydb.TicketAttachments (
FILE_ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
LastUpdate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
TKT_ID INT NOT NULL,
seqno INT NOT NULL,
XMTSTicket VARCHAR(30) NOT NULL,
MessageID VARCHAR(200) NULL,
FileType VARCHAR(100) NOT NULL,
FileSize INT NOT NULL,
Name VARCHAR(300) NULL,
create_date VARCHAR(40) NULL,
mod_date VARCHAR(40) NULL,
read_date VARCHAR(40) NULL,
Content MEDIUMBLOB NOT NULL)
TABLESPACE TSTicketAtt Engine=InnoDB CHARACTER SET utf8 COLLATE utf8_general_ci;

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.