Case 1: very sparse; 500 different flags; average of 3-20 flags set:
CREATE TABLE x (
id INT UNSIGNED NOT NULL COMMENT 'JOIN to main table',
flag SMALLINT UNSIGNED NOT NULL COMMENT 'flag number (1-500)',
PRIMARY KEY(id, flag), -- for INSERT & DELETE
INDEX(flag, id) -- for SELECT
);
# For all flags set:
SELECT ...
FROM MainTable m
JOIN ( SELECT id
FROM (
( SELECT id FROM x WHERE flag = 23 )
UNION ALL
( SELECT id FROM x WHERE flag = 196 )
) x
GROUP BY id
HAVING COUNT(*) = 2
) y ON x.id = m.id
...
# For all flags reset:
SELECT ...
FROM MainTable m
LEFT JOIN x ON x.id = m.id
AND flag IN (23, 196)
WHERE x.id IS NULL -- "none of the flags are set"
...
Adjust INT, SMALLINT to MEDIUMINT/TINYINT if appropriate.
Construct the SELECTs in the UNION dynamically.
Construct the IN(...) and the HAVING count dynamically.
Case 2: not sparse; 500 different flags; 100-400 flags set:
This will be best done via a "table scan":
CREATE TABLE x (
id INT UNSIGNED NOT NULL COMMENT 'JOIN to main table',
group1 INT UNSIGNED NOT NULL COMMENT 'some of the flags',
group2 INT UNSIGNED NOT NULL COMMENT 'some more flags',
group3 SMALLINT UNSIGNED NOT NULL COMMENT 'the last few flags',
PRIMARY KEY (id), -- for INSERT & DELETE
);
# For checking several flags being all set:
SELECT ...
FROM MainTable m
JOIN ( SELECT id
FROM x
WHERE ((group2 & 0x305f) = 0x305f)
AND ((group5 & 0xe41) = 0xe41)
) x ON x.id = m.id
...
# Checking that several flags are reset:
WHERE ((group2 & 0x305f) = 0)
AND ((group5 & 0xe41) = 0)
# This design even allows for a mixture of set and reset:
WHERE ((group2 & 0x305f) = 0x0019)
AND ((group5 & 0xe41) = 0xc40)
* The 500 flags are broken up into groups.
* Each group has no more than 32 flags.
* When a 'group' is 24 bits or smaller, pick MEDIUMINT/SMALLINT/TINYINT.
* You have to build the WHERE clause.
* If many queries use _only_ group1, then add
INDEX(group1, id)
Performance note:
For case 2, I chose to build a separate table instead of adding the new fields to the existing table (MainTable). Why? Because the table scan will have to read all the data; this is probably faster in a smaller table. (`x` is smaller than `MainTable` because of having fewer columns.)