Help with partially unique index
Posted by:
El T
Date: June 22, 2016 04:44AM
From the thread title, everyone will probably immediately consider this to be a "wrong question". I'm sure it is, but I'd really appreciate help formulating the right one. :)
I need advice from people with a good knowledge of how indexing works on InnoDB. I have a table of twelve columns, a small portion of which form a unique combination and by which it is always accessed. The below is a very mildly simplified copy of what I have:
CREATE TABLE `user_category_info` (
`account_id` bigint(20) unsigned DEFAULT NULL,
`user_id` bigint(20) unsigned NOT NULL,
`category_id` smallint(5) unsigned NOT NULL,
`status` tinyint(3) unsigned NOT NULL,
`score` int(10) unsigned NOT NULL,
`somevalueA` int(10) unsigned DEFAULT NULL,
`somevalueB` int(10) unsigned DEFAULT NULL,
`somevalueC` int(10) unsigned DEFAULT NULL,
`somevalueD` int(10) unsigned DEFAULT NULL,
`somevalueE` int(10) unsigned DEFAULT NULL,
`somevalueF` int(10) unsigned DEFAULT NULL,
`somevalueG` int(10) unsigned DEFAULT NULL,
UNIQUE KEY `udx_acc_user_cat_stat_comp` (`account_id`,`user_id`,`category_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
As you can see, the account_id, user_id and category_id are a unique combination and the table is nearly always accessed by all three whatever other criteria might also be applied. There is no id field for two reasons - with the unique combination, none is needed and the table in practice is partitioned by the account_id field because the table is both read and written to in roughly equal amounts at quite a high volume and large updates or reads are always limited to a specific account id. Ergo, it makes a natural partition break. I could still have an auto-incrementing id column across all partitions, but it seems to be only a negative in my case.
Anyway, the above works very well BUT many of my queries can be answered simply by checking the values of the status and score columns. And even those that don't usually include the status and score as part of their search criteria which would allow the easy discounting of many otherwise matching records. (E.g. WHERE status = 1). That to me seems like a strong suggestion to use an index that includes the status and the score columns as well. With such an index, I can resolve large numbers of queries just by checking the index and not even touching the records themselves. And the even rest should run faster too due to easy exclusion of potential matches.
However, this is where I need help. If I were to extend the index to include the status and score columns, then I must make it non-unique because only the first three columns comprise a unique combination. Status and score can have varying values. I.e. I could have the following
account_id | user_id | category_id | status | score
12000 | 18320 | 3 | 1 | 98
or
account_id | user_id | category_id | status | score
12000 | 18320 | 3 | 0 | 98
but I couldn't have:
account_id | user_id | category_id | status | score
12000 | 18320 | 3 | 1 | 98
12000 | 18320 | 3 | 0 | 98
So as that is not an option, I could just add a second index on status and score. But that would almost certainly be a heavy hit on performance.
So what I'm left with is not indexing these two columns despite that they would provide enormous savings in performance.
AM I MISSING ANYTHING?
I know what I want conceptually. I need an index that says "only treat these first three columns as unique for identifying, these other two are 'data' ". It's almost as though I want a sub-index to my primary index. This can't be a unique scenario. How do people solve it? MUST I give up the gains of having my most readily needed data and search criteria in the index? Is there some way internally in InnoDB that it can handle this so it's not the problem I think it is?
Any insight is much appreciated.