MySQL InnoDB Record Lock

Recently, I encountered an issue that appeared to be a deadlock caused by an InnoDB Record Lock and had to investigate it. Here, I summarize the behavior and implementation of InnoDB's Index Tree and Record Lock, which I organized during that investigation.

How InnoDB stores Index

First, let's explain how InnoDB handles indexes in terms of data structure.

It is well-known that InnoDB indexes are implemented as B-trees (B+-trees). When InnoDB began to be implemented in the 1990s, SSDs were not mainstream, and it was impossible to fit all data into memory. Therefore, B-trees were chosen to efficiently store data on disk.

Strictly speaking, there are two types of indexes: Clustered Indexes and Secondary Indexes. Clustered Indexes use the primary key as the node key and store the actual data values in the leaf nodes. All MySQL tables have this Clustered Index.

On the other hand, Secondary Indexes are created when an index is set, and the leaf nodes hold references to the primary key nodes of the Clustered Index (they do not hold the actual data). Secondary Indexes are used to improve search efficiency for specific search patterns.

MySQL InnoDB Clustered Index

InnoDB Index Tree

COLUMN: All changes made to the index tree are stored in the WAL (Write Ahead Log), or Redo Log in InnoDB terminology. These are append-only logs that store all the history of changes. Thanks to this Redo Log, InnoDB can ensure data consistency before and after outages.

Record Locks

To check the latest transactions, you can use the output of SHOW ENGINE INNODB STATUS. In addition to the latest transactions, you can also check foreign key errors, detected deadlocks, transactions, and File I/O.

Here, let's consider what a Record Lock specifically represents. A Record Lock is a shared or exclusive lock on the leaf nodes of the aforementioned Index Tree.

MySQL InnoDB Record Lock

Record Lock Behavior

COLUMN: InnoDB supports other types of locks as well, such as "Gap Locks" and "Next-Key Locks". Their behaviors can be easily understood once you realize that InnoDB stores indexes as B-trees and locks occur there. For example, Gap Locks are mechanisms to acquire locks between one index record and another (hence the name "gap"). As for Next-Key Locks, it is (Gap Lock) + (Record Lock of the next key).

Lock Modes

There are two types of lock modes: Shared lock (S) and Exclusive lock (X).

In short, a Shared lock can be shared among multiple threads to read the locked index record. On the other hand, while an Exclusive Lock is acquired on an index record, no other thread can perform any operations.

For example, transactions A and B attempt to acquire a lock on a specific index record. The following table shows whether operations on the index record conflict depending on which lock mode each transaction uses. If either thread acquires an Exclusive Lock, the other thread will always be blocked.

transaction A \ transaction B acquire S acquire X
acquire S Non-blocked :) Conflicts :(
acquire X Conflicts :( Conflicts :(

Implementation

The actual implementation of locks uses "latch". The data structure is essentially a custom implementation of RW lock.

storage/innobase/lock/lock0lock.cc

The latch is sharded across multiple rw_lock_t instances to distribute the load across multiple cache lines. Hence, the prefix Sharded_. Note that this is not a typo of "Shared Lock."

storage/innobase/include/sync0shared_rw.h

An interesting point is how Shared Lock/Exclusive Lock is implemented using this RW-Lock. Internally, it is represented as an array with multiple RW-Locks. An Exclusive lock acquires all the RW-Locks in this array, while a Shared lock acquires any one of the RW-Locks in this array.

2021-01-12