MySQL Forums

Re: Fastest way to check for an image in a table
Posted by: Rick James
Date: September 25, 2010 08:15PM

String searches are not slow.

Make a parallel table. OK, so you are already doing that. The table contains meta information about the image, including the hash.

The problem comes when the index on the hash column of the meta table is too big to be cached in RAM. Hashes are effectively random, so consecutive lookups land in unrelated parts of the index. Once the index no longer fits in cache, a check to see if the hash is already in the table will usually need a disk access. That is, with under, say, 50M images, your checks are essentially CPU-bound, and you can probably check more than 1000 per second. With a billion images, the checks become I/O-bound and will slow down to something like 100 per second.
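To get a feel for when the index stops fitting in RAM, here is a rough back-of-envelope sketch in Python. The function names, the 12-byte per-entry overhead and the 4 GiB cache size are all assumptions made up for illustration, not MySQL figures; real index sizes depend on the storage engine, the row pointer or primary key size, and page fill.

```python
GIB = 1024 ** 3


def estimated_index_bytes(rows: int, key_bytes: int, per_entry_overhead: int = 12) -> int:
    """Very rough index size: each entry holds the key plus some overhead.

    per_entry_overhead is an assumed placeholder, not a measured MySQL value.
    """
    return rows * (key_bytes + per_entry_overhead)


def fits_in_cache(rows: int, key_bytes: int, cache_bytes: int,
                  per_entry_overhead: int = 12) -> bool:
    """True if the estimated index is no larger than the cache available for it."""
    return estimated_index_bytes(rows, key_bytes, per_entry_overhead) <= cache_bytes


if __name__ == "__main__":
    cache = 4 * GIB  # assumed amount of RAM available for caching the index
    for rows in (50_000_000, 1_000_000_000):
        for key_bytes in (32, 16):  # hex string vs. raw MD5 bytes
            size = estimated_index_bytes(rows, key_bytes)
            verdict = "fits" if fits_in_cache(rows, key_bytes, cache) else "does not fit"
            print(f"{rows:>13,} rows, {key_bytes}-byte key: "
                  f"~{size / GIB:.1f} GiB, {verdict} in cache")
```

Plug in your own row count and cache size. The point is only that the estimate grows linearly with the number of images and with the key length.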

MD5 is a 128-bit hash, good enough for anything short of identifying each distinct atom in the universe. The hex version (32 characters) could be put into a BINARY(32) field, not VARCHAR(32). Note I say BINARY since you don't need any collation. And you don't need VAR since it is a constant length. Or the raw 16-byte value could be put into BINARY(16), for example by storing UNHEX(MD5(...)) instead of the hex string. This would halve the key length and shrink the index size.
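If the hash is computed on the client side, the two storage forms look like this. This is a minimal sketch using Python's standard hashlib; the function name and the sample bytes are made up for illustration.

```python
import hashlib


def image_hash_forms(image_bytes: bytes) -> tuple[bytes, str]:
    """Return the MD5 of the image as (raw 16 bytes, 32-char hex string)."""
    h = hashlib.md5(image_bytes)
    return h.digest(), h.hexdigest()


if __name__ == "__main__":
    raw, hex_text = image_hash_forms(b"example image bytes")
    print(len(raw), len(hex_text))        # 16 32
    # Converting hex back to raw bytes is what UNHEX() does on the MySQL side.
    print(bytes.fromhex(hex_text) == raw)  # True
```

The raw form is what would go into a BINARY(16) column, and the hex form into BINARY(32). Both identify the same image; the raw form just uses half the bytes per key.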

How many images do you have?

