Re: Redundant InnoDB lock
Hello, Jakub!
Thanks so much for such a detailed answer.
I no longer hoped that someone would answer here and therefore created a bug report #98639 about this issue.
You've written:
> I wouldn't use the word "bug", because there is no data loss, wrong answer, etc. - just less concurrency than one could imagine in a different implementation (which might have other issues in other scenarios). And I'm reluctant to even use the word "inefficiency", given that "number of resources locked" is just one axis, some of others being "simplicity of design" and "speed of development".
ОК. It's an inefficiency issue. By the way, when I created a bug report, I was thinking about reducing its severity from S3 to S5.
> locking happens even if the index on 'b' is UNIQUE. What matters is rather that it is a secondary (=not PRIMARY) index.
Thanks for correction.
> if the index is non-unique, we have additional fun, of having, say (b=3,a=5),(b=3,a=3),(b=3,a=2) in it, and having to figure out which (if any) of the "siblings" to lock, which gaps, etc.
The only important thing is which sibling will be closest to the scanned range. I checked that only the closest sibling may be blocked. If the uniqueness of the index played a role, then either all the siblings or none of them would be blocked.
> Also, the index can be defined on multiple columns (KEY (b,c)) while the query can reference just some prefix of it (WHERE b<3), or even a prefix of a column if it is a string.
In this case, everything should work the same way as in the case of a non-unique index. No additional problems would arise here.
> If you declare the index as DESC then it has tuples sorted in the descending order
> The scan itself can be ascending or descending depending on the ORDER BY clause.
Good idea! I've performed additional tests and found out the following (with the RR isolation level).
SELECT * FROM foo WHERE b < 0 FOR UPDATE;
Sets an ordinary X-lock on (b=1,a=1) tuple. The correct lock type should be X,GAP (that is gap only).
SELECT * FROM foo WHERE b < 0 ORDER BY b DESC FOR UPDATE;
Sets the correct X,GAP lock on the (b=1,a=1). Suddenly!
SELECT * FROM foo WHERE b > 10 FOR UPDATE;
Sets the correct X-lock on the supremum pseudo-record, nothing else as needed.
SELECT * FROM foo WHERE b > 10 ORDER BY b DESC FOR UPDATE;
In addition to blocking the supremum, it sets an unnecessary X-lock on (b=9,a=9). Now there will be unreasonable problems with INSERTs into a gap between 7 and 9 (in addition to the problem of deleting the last index record).
Now let the index 'b' be descending.
ALTER TABLE foo DROP INDEX b;
CREATE INDEX b ON foo (b DESC);
SELECT * FROM foo WHERE b < 0 FOR UPDATE;
Sets the correct X-lock on the supremum pseudo-record, nothing else as needed (now the supremum is some big negative number).
SELECT * FROM foo WHERE b < 0 ORDER BY b ASC FOR UPDATE;
In addition to blocking the supremum, it sets an unnecessary X-lock on (b=1,a=1). Very bad.
SELECT * FROM foo WHERE b > 10 FOR UPDATE;
Sets an ordinary X-lock on (b=9,a=9) tuple. The correct lock type should be X,GAP.
SELECT * FROM foo WHERE b > 10 ORDER BY b ASC FOR UPDATE;
Sets the correct X,GAP lock on the (b=9,a=9).
Now it's easy to see that the problem only occurs when the index scan completes by going beyond a scanned range. In this case, the algorithm most likely sets the unconditional X-lock on the first index record, for which the scan condition ceases to be satisfied. The problem does not occur at the beginning of the index scan. It occurs only at the end. I think it’s easy to change the algorithm to set an extra lock at the end of the scan in the same way as at the beginning of the scan. Unfortunately, I am not a C++ programmer and cannot find the right place in the code that needs to be fixed. I don't understand this code.