Re: Inverse Lookup in Key-Value Storage: avoid disk seeks
In the database world, we call this mapping "many-to-one":
A->1
B->2
C->1
D->3
It sounds like the table would have a UNIQUE index on A,B,... and a non-unique INDEX on 1,2,...
Think of an index as another table. And every table or index has an ordering. INDEXes (including UNIQUE) are stored in the specified order. InnoDB data is stored in the PRIMARY KEY's order. MyISAM's data is (usually) simply stored in the order that you inserted the rows -- this is often a useless order.
> for each entry MySQL needs to access the disk in order to load the corresponding KEY
Not for InnoDB. When you say
INDEX(VALUE)
you implicitly get a pseudo table with two columns: VALUE, KEY. And it is sorted by VALUE, then KEY.
Likewise, in MyISAM, if you explicitly say
INDEX(VALUE, KEY)
Then the SELECT KEY FROM map WHERE VALUE = ... drills down the BTree of the index, and finds the KEY(s) right there. Note: In your case, it will find many KEYs -- This it can easily do by (logically) doing:
1. Drill down the BTree to find the _first_ row containing the desired VALUE.
2. Scan forward until VALUE changes, collecting all the KEYs found.
Step 1 is 3-4 hops for a million-row table.
Step 2 is very efficient because the items are adjacent. That is it might gather 100 KEYs before it has to move on to the next 'block'.