MySQL Forums
Forum List  »  Performance

Re: 350+ million rows -- is this as fast as it can get? (too slow at the moment!)
Posted by: Rick James
Date: February 11, 2009 01:46AM

You fell over the cliff...

As long as the table and indexes are small enough to fit in cache (in RAM), adding a new row takes a constant amount of time (approx.)

Once your table is bigger, disk hits are required; and the frequency increases. Let's look at some cases...

Simple AUTO_INCREMENT PRIMARY KEY, no secondary keys. Well this won't slow down -- you are always appending to the end of the table and the PK.

Let's add a secondary key that is "random" (a name, or whatever, that won't be ordered as you do the inserts). Now, for each insert, it has to reach for the index block where it needs to put the pointer to that record. If the index can't be cached fully, it may have to do a disk read (and eventually a write).

If the index is 5% bigger than cache, you might have to do such a read 5% of the time. Get the picture? If the index is 5 times as big as the cache, it will usually have to hit the disk for a random block. (~80% of the time)

More secondary keys? More disk hits (if they are random).

I don't think it is actually "exponential", but it may feel like it.

Techniques that help:
1. Shrink the data
2. Shrink the data
3. Shrink the data
Ok, seriously:
* MyISAM is 2x-3x smaller.
* Use INT (4 bytes) instead of BIGINT (8 bytes); or even MEDIUMINT (3), etc.
* Compress blobs and big text fields
* Normalize
* In InnoDB, with 4 secondary indexes, remember that the PRIMARY KEY is in every secondary index -- so keep the PK small. (MyISAM has a byte offset or record number, so this does not apply.)
* When bulk loading, sort the data (of course, you can't sort it in all the ways you have indexes)
* Turn off indexing while inserting; turn back on afterwards. It will be a lengthy "Reindexing via Repair", but might still be faster.
* (InnoDB) Turn off Foreign key checks during a bulk load.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.