MySQL Forums

Re: Very Large Non-Dynamic Table for Lookups
Posted by: Rick James
Date: October 25, 2009 12:32PM

Look at it this way. Let's say you have 1GB for innodb_buffer_pool_size (a practical limit on a 2GB machine) and the data is 100GB. Now, let's say you want to fetch a _random_ SSN. I do mean random; you can't predict what value it will have. What are the odds of that one being in cache at the moment? 1GB/100GB = 1%. That is, fetch 100 _random_ SSNs and, on average, 99 will hit the disk; only 1 will be in RAM.
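The arithmetic above can be sketched as a back-of-envelope model (the 1GB / 100GB figures come from the post; nothing else is measured):

```python
# Buffer pool size vs. total data size determines the cache-hit rate
# for uniformly random point lookups.
buffer_pool_gb = 1    # innodb_buffer_pool_size on a 2GB machine
data_gb = 100         # total table + index size on disk

hit_rate = buffer_pool_gb / data_gb      # 0.01 -> 1% of pages cached
lookups = 100
disk_reads = lookups * (1 - hit_rate)    # expected cache misses

print(f"cache hit rate: {hit_rate:.0%}")
print(f"of {lookups} random SSN fetches, ~{disk_reads:.0f} hit the disk")
```

The key assumption is "uniformly random": any skew in the access pattern (hot SSNs) would raise the hit rate above this floor.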

Options:
* Faster disk -- faster drives help a little; RAID helps some with simultaneous accesses; SSDs might give you 10x
* More RAM -- 32GB of RAM (costly!) might get you closer to 300 fetches/sec.
* Shrink the data -- storing the SSN as an INT: 4 bytes instead of 10. Only a slight help.
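To see why shrinking the data is "only a slight help," here is a rough model. The 100-byte average row size is an assumed figure for illustration; the 10-byte-text vs. 4-byte-INT saving is from the option above:

```python
# Storing the SSN as an unsigned INT saves ~6 bytes per row. If the
# whole row is ~100 bytes (assumed), that barely moves the hit rate.
buffer_pool_gb = 1
row_bytes_text = 100                 # assumed average row size, SSN as text
row_bytes_int = row_bytes_text - 6   # 10-byte text SSN -> 4-byte INT

data_gb_text = 100
data_gb_int = data_gb_text * row_bytes_int / row_bytes_text  # 94GB

print(f"hit rate with text SSN: {buffer_pool_gb / data_gb_text:.2%}")
print(f"hit rate with INT SSN:  {buffer_pool_gb / data_gb_int:.2%}")
```

A 1% hit rate becomes roughly 1.06% -- a slight help, exactly as stated.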

Sort the SSNs? If you are about to fetch 10 million SSNs, sorting them first will help some. The first 3 digits of my SSN say where I lived when I got mine. That does not feel like a useful sort order.
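The benefit of sorting a big batch comes from locality: adjacent keys tend to sit on the same index leaf page, so sorted access turns some random page reads into repeated reads of the same page. A toy model (ROWS_PER_PAGE and the key range are assumed figures, not InnoDB measurements):

```python
import random

ROWS_PER_PAGE = 50  # assumed keys per leaf page, for illustration only

def pages_touched(keys):
    # Model: key k lives on page k // ROWS_PER_PAGE. Count how many
    # times we move to a different page as we walk the key list.
    pages = [k // ROWS_PER_PAGE for k in keys]
    return sum(1 for a, b in zip(pages, pages[1:]) if a != b) + 1

random.seed(0)
keys = random.sample(range(1_000_000), 10_000)

print("page reads, unsorted:", pages_touched(keys))
print("page reads, sorted:  ", pages_touched(sorted(keys)))
```

Sorted access never touches more pages than unsorted, but with truly random keys spread over a huge key space the saving is modest -- it "helps some," no more.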

The SSN is very sensitive information. Have you heard of Identity Theft? Do you have names connected to the SSNs? If you do, expect a call from the FBI. OTOH, even if you have a legitimate use for the SSNs, you still need a serious lesson in security.

I'm happy to talk about big tables. I won't apologize for scaring you about SSNs.
