MySQL :: Re: Clustered primary key question

New Topic

Re: Clustered primary key question

Posted by: Rick James
Date: August 22, 2015 01:37PM

I agree with your arguments for why GUIDs/UUIDs are a valuable tool in databases.

I come from the position of "If you have so many rows that performance is a problem, GUIDs are likely to be a major cause." Next... Given that you choose to sticks with GUIDs, then I try to provide compromises that let you keep GUIDs, but recover some of the performance. Let me comment on (1) clustering and (2) data size, plus give you a link for more discussion:

(1) As you mentioned, if the GUID were not clustered, it would be better. This _can_ be done with MySQL:
PRIMARY KEY(id) -- INT UNSIGNED AUTO_INCREMENT, only 4 bytes, plus
INDEX(guid)

In InnoDB, you have id clustered, and the table is added to end the "end", roughly in chronological order. Then there is a secondary index which has (guid, id) in a different BTree. This BTree is much smaller than the data's BTree, hence your data set can grow longer before anything (namely this secondary BTree) becomes too big to be cached and starts to hit the disk a lot. (I suspect that Postgres, etc, suffer this same problem; just that they make it easier to avoid the clustering.) (And, no, a HASH index is no better than a BTree for a _random_ key like this.)

(2) Often the casual user says "guid VARCHAR or CHAR(36) CHARACTER SET utf8". There are several ways to store a guid; the worst takes 108 bytes; the best takes 16. 16 gives you, again, more room for growth before hitting the performance cliff.

CHAR(36) utf8: 3*36 = 108 bytes
VARCHAR(36) (any charset): 2+36 = 38 bytes
BINARY(16) (after converting): 16 bytes

If you use only "type 1" UUIDs (which MySQL uses, but others generally don't), you can rearrange the bits into time order. See
http://mysql.rjweb.org/doc.php/uuid
This gives you the best properties of both AUTO_INCREMENT (approximately) and GUIDs.

That reference also provides Stored Functions to convert between CHAR(36) and BINARY(16).

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Written By

Posted

Clustered primary key question

Doron Grinstein

August 19, 2015 04:58PM

Re: Clustered primary key question

Barry Galbraith

August 19, 2015 05:28PM

Re: Clustered primary key question

Doron Grinstein

August 19, 2015 05:38PM

Re: Clustered primary key question

Barry Galbraith

August 19, 2015 05:56PM

Re: Clustered primary key question

Rick James

August 19, 2015 08:27PM

Re: Clustered primary key question

Doron Grinstein

August 20, 2015 08:43AM

Re: Clustered primary key question

Rick James

August 22, 2015 01:37PM

Re: Clustered primary key question

Doron Grinstein

August 23, 2015 03:33AM

Re: Clustered primary key question

Rick James

August 23, 2015 05:48PM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.