Re: PK of 2 columns - to reduce disk seek operations
Posted by: Rick James
Date: December 18, 2013 02:25PM

I suggest this:

PRIMARY KEY(ClusterID, TaskID),
INDEX(TaskID)

* The primary key is unique because TaskID is part of it.
* The AUTO_INCREMENT will work since TaskID is the first column in some index.
* You do lose the UNIQUEness constraint on TaskID, but if you never do arbitrary inserts with TaskID, that is not an issue.
* By not having TaskID "UNIQUE", you save some performance during INSERTs.
* As you say, you get the 'clustering' benefits on the table.

<fact>
The fill factor in InnoDB is a constant 15/16.
</fact>
<opinions>
* "Fill factor" is over-sold in the lore.
* 15/16 is nearly as good as the 'perfect' fill factor.
</opinions>

Keep in mind that after a 'block split', each of the resulting blocks is about 50% full. That is, there will be a lot of room for further INSERTs before the next split.

If there are not many different values for ClusterID, have you suitably typed it? For example, SMALLINT UNSIGNED allows 64K distinct values and takes only 2 bytes (versus 4 bytes for INT).

Also, be sure to set innodb_buffer_pool_size appropriately. See
http://mysql.rjweb.org/doc.php/memory

If you are using MyISAM, much of what I said does not apply. You should switch to InnoDB to get the clustering by PRIMARY KEY.

Options: ReplyQuote


Subject
Written By
Posted
Re: PK of 2 columns - to reduce disk seek operations
December 18, 2013 02:25PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.