写点什么

GreatSQL 死锁案例分析

作者:GreatSQL
  • 2024-04-19
    福建
  • 本文字数:6674 字

    阅读完需:约 22 分钟

1.背景概述

客户业务发生死锁的报错,根据业务程序日志及业务流程,发现造成死锁的原因是:事务 1 delete + insert ,事务 2 delete + insert 2 个事务交替执行导致的死锁;由于 GAP 锁阻塞了插入意向锁,并且当 delete 的数据存在时死锁不会发生,当 delete 的数据不存在时,会发生死锁。

2.问题复现

本次测试基于 GreatSQL-8.0.32-24,隔离级别为 RR

2.1 创建测试表

greatsql> create database test;
greatsql> create table test(c1 int unique key, c2 int, c3 int, c4 int);
greatsql> insert into test values (1,1,1,1),(3,3,3,3),(5,5,5,5),(9,9,9,9);
greatsql> select * from test;+------+------+------+------+| c1 | c2 | c3 | c4 |+------+------+------+------+| 1 | 1 | 1 | 1 || 3 | 3 | 3 | 3 || 5 | 5 | 5 | 5 || 9 | 9 | 9 | 9 |+------+------+------+------+4 rows in set (0.01 sec)
复制代码

2.2 事务执行顺序

按以下事务执行顺序,如果要删除的数据存在,则不会发生死锁;如果要删除的数据不存,并且要删除的数据在同一个 GAP 锁的区间内则会发生死锁;


2.3 当 delete 的数据存在时

事务 1:delete


greatsql> begin;Query OK, 0 rows affected (0.00 sec)
greatsql> delete from test where c1=3;Query OK, 1 row affected (0.00 sec)
复制代码


查看锁信息:


greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+| 59 | test | test | NULL | TABLE | IX | GRANTED | NULL || 59 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 3, 0x000000000201 || 59 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000201 |+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+3 rows in set (0.00 sec)
复制代码


此时事务 1 给 3, 0x000000000201 这条数据加了 记录锁 X,REC_NOT_GAP


事务 2:delete


greatsql> begin;Query OK, 0 rows affected (0.00 sec)
greatsql> delete from test where c1=5;Query OK, 1 row affected (0.00 sec)
复制代码


查看锁信息:


greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+| 57 | test | test | NULL | TABLE | IX | GRANTED | NULL || 57 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 5, 0x000000000202 || 57 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000202 || 59 | test | test | NULL | TABLE | IX | GRANTED | NULL || 59 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 3, 0x000000000201 || 59 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000201 |+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+6 rows in set (0.00 sec)
复制代码


此时事务 2 给 5, 0x000000000202 这条数据加了 记录锁 X,REC_NOT_GAP


事务 1:insert


greatsql> insert into test value(3,3,3,3);Query OK, 1 row affected (0.00 sec)
复制代码


查看锁信息:


greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+
| 57 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 57 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 5, 0x000000000202 |
| 57 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000202 |
| 59 | test | test | NULL | TABLE | IX | GRANTED | NULL |
| 59 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 3, 0x000000000201 |
| 59 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000201 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 3, 0x000000000201 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 5, 0x000000000202 |
| 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 3, 0x000000000206 |
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+9 rows in set (0.00 sec)
复制代码


此时事务 1 给 被 delete 删除的数据 3, 0x000000000201 ,插入的数据 3, 0x000000000202,以及相邻的下一条数据 5, 0x000000000202 加了 间隙锁 S,GAP


事务 2:insert


greatsql> insert into test value(5,5,5,5);Query OK, 1 row affected (0.01 sec)
复制代码


查看锁信息:


greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+| 57 | test | test | NULL | TABLE | IX | GRANTED | NULL || 57 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 5, 0x000000000202 || 57 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000202 || 57 | test | test | c1 | RECORD | S,GAP | GRANTED | 5, 0x000000000202 || 57 | test | test | c1 | RECORD | S,GAP | GRANTED | 9, 0x000000000203 || 57 | test | test | c1 | RECORD | S,GAP | GRANTED | 5, 0x000000000207 || 59 | test | test | NULL | TABLE | IX | GRANTED | NULL || 59 | test | test | c1 | RECORD | X,REC_NOT_GAP | GRANTED | 3, 0x000000000201 || 59 | test | test | GEN_CLUST_INDEX | RECORD | X,REC_NOT_GAP | GRANTED | 0x000000000201 || 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 3, 0x000000000201 || 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 5, 0x000000000202 || 59 | test | test | c1 | RECORD | S,GAP | GRANTED | 3, 0x000000000206 |+-----------+---------------+-------------+-----------------+-----------+---------------+-------------+-------------------+12 rows in set (0.00 sec)
复制代码


此时事务 2 给 被 delete 删除的数据 5, 0x000000000202 ,插入的数据 5, 0x000000000207,以及相邻的下一条数据 9, 0x000000000203 加了 间隙锁 S,GAP


由于 GAP 锁之间是相互兼容的,所以没有发生锁等待及死锁,此时事务 1,事务 2 都执行完成,可以正常提交。

2.4 当 delete 的数据不存在时

事务 1:delete


greatsql> begin;Query OK, 0 rows affected (0.01 sec)
greatsql> delete from test where c1=6;Query OK, 0 rows affected (0.00 sec)
复制代码


查看锁信息:


greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+| 61 | test | test | NULL | TABLE | IX | GRANTED | NULL || 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 |+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+2 rows in set (0.00 sec)
复制代码


此时事务 1 给 9, 0x000000000203 这条数据加了 间隙锁 X,GAP


事务 2:delete


greatsql> begin;Query OK, 0 rows affected (0.00 sec) greatsql> delete from test where c1=7;Query OK, 0 rows affected (0.00 sec)
复制代码


查看锁信息:


greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA     |+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+|     62 | test      | test     | NULL    | TABLE   | IX     | GRANTED   | NULL        ||     62 | test      | test     | c1     | RECORD   | X,GAP   | GRANTED   | 9, 0x000000000203 ||     61 | test      | test     | NULL    | TABLE   | IX     | GRANTED   | NULL        ||     61 | test      | test     | c1     | RECORD   | X,GAP   | GRANTED   | 9, 0x000000000203 |+-----------+---------------+-------------+------------+-----------+-----------+-------------+-------------------+4 rows in set (0.00 sec)
复制代码


此时事务 2 给 9, 0x000000000203 这条数据加了 间隙锁 X,GAP 间隙锁可以相互兼容,因此没有报错


事务 1:insert


greatsql> insert into test value(6,6,6,6);---hang住,处于锁等待
复制代码


查看锁信息:


greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+| 62 | test | test | NULL | TABLE | IX | GRANTED | NULL || 62 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 || 61 | test | test | NULL | TABLE | IX | GRANTED | NULL || 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 || 61 | test | test | c1 | RECORD | X,GAP,INSERT_INTENTION | WAITING | 9, 0x000000000203 |+-----------+---------------+-------------+------------+-----------+------------------------+-------------+-------------------+5 rows in set (0.00 sec)
复制代码


此时事务 1,要申请给 9, 0x000000000203 这条数据加 间隙锁,插入意向锁 X,GAP,INSERT_INTENTION;由于事务 2 已经加了 间隙锁 X,GAP 间隙锁与插入意向锁并不兼容,因此事务 1 的 insert 处于锁等待状态


事务 2:insert


greatsql> insert into test value(7,7,7,7);ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
复制代码


查看锁信息:


greatsql> select THREAD_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+| THREAD_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+| 61 | test | test | NULL | TABLE | IX | GRANTED | NULL || 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 9, 0x000000000203 || 61 | test | test | c1 | RECORD | X,GAP | GRANTED | 6, 0x000000000213 || 61 | test | test | GEN_CLUST_INDEX | RECORD | X | GRANTED | supremum pseudo-record || 61 | test | test | GEN_CLUST_INDEX | RECORD | X,GAP | GRANTED | 0x000000000213 || 61 | test | test | c1 | RECORD | X,GAP,INSERT_INTENTION | GRANTED | 9, 0x000000000203 |+-----------+---------------+-------------+-----------------+-----------+------------------------+-------------+------------------------+6 rows in set (0.01 sec)
复制代码


事务 2 回滚,只有事务 1 的加锁信息。


由于此时事务 2,要申请给 9, 0x000000000203 这条数据加 间隙锁,插入意向锁由于事务 1 已经加了 间隙锁 X,GAP 间隙锁与插入意向锁并不兼容,因此事务 2 的 insert 处于锁等待状态,2 个事务相互等待锁导致死锁,此时事务 2 回滚。

3.总结

此次死锁的发生主要是 GAP 锁 和 插入意向锁的冲突,建议让业务修改一下逻辑,先判断数据是否存在 select 一下,存在的话 delete 后在 insert ; 不存在的话直接 insert 不用 delete 了。


发布于: 1 小时前阅读数: 6
用户头像

GreatSQL

关注

GreatSQL社区 2023-01-31 加入

GreatSQL是由万里数据库维护的MySQL分支,专注于提升MGR可靠性及性能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版本。 社区:https://greatsql.cn/ Gitee: https://gitee.com/GreatSQL/GreatSQL

评论

发布
暂无评论
GreatSQL 死锁案例分析_GreatSQL_InfoQ写作社区