事务处理对持久统计信息自动收集的影响
- 2025-04-02 福建
本文字数:9091 字
阅读完需:约 30 分钟
事务处理对持久统计信息自动收集的影响
1. 持久化统计信息
持久统计信息将统计信息存储到磁盘,使它们在服务器重启后保持不变,优化器更有可能在查询时做出一致的选择,从而提高执行计划的稳定性。
在innodb_stats_persistent=ON(默认值)或表定义使用 stats_persistent=1时,优化器统计信息会持久化保存。
持久统计信息存储在 mysql.innodb_table_stats 和 mysql.innodb_index_stats 表中,last_update 列可以看到上次更新统计信息的时间。
系统变量innodb_stats_auto_recalc(默认 ON)控制表行更改超过 10%时,是否自动计算统计信息。也可以通过创建或更改表时指定stats_auto_recalc子句为单个表配置自动统计重新计算。
由于自动统计信息收集是一个后台线程,其处理过程与 DML 操作是异步的,在 DML 操作超过 10% 的表后,可能不会立即重新计算统计信息。在某些情况下,统计数据重新计算可能会延迟几秒钟。如果立即需要最新的统计信息,执行 ANALYZE TABLE 以启动统计信息的同步计算。
事务的 commit 和 rollback 会影响统计信息的自动收集么?通过下面测试,可以回答这问题。
2. 测试 commit 和 rollback 对持久统计信息收集的影响
测试环境的系统变量值:
greatsql> SHOW GLOBAL VARIABLES LIKE 'innodb_stats%';+--------------------------------------+-------------+| Variable_name | Value |+--------------------------------------+-------------+| innodb_stats_auto_recalc | ON || innodb_stats_include_delete_marked | OFF || innodb_stats_method | nulls_equal || innodb_stats_on_metadata | OFF || innodb_stats_persistent | ON || innodb_stats_persistent_sample_pages | 20 || innodb_stats_transient_sample_pages | 8 |+--------------------------------------+-------------+
2.1 INSERT 操作
greatsql> TRUNCATE TABLE test1;Query OK, 0 rows affected (0.05 sec)
-- 开启事务,在空表test1中插入10万行数据greatsql> BEGIN;Query OK, 0 rows affected (0.00 sec)
greatsql> SELECT now();INSERT INTO test1 SELECT * FROM LIMIT 100000;SELECT now();+---------------------+| now() |+---------------------+| 2025-01-07 09:59:19 |+---------------------+1 row in set (0.00 sec)
Query OK, 100000 rows affected (2.73 sec)Records: 100000 Duplicates: 0 Warnings: 0
+---------------------+| now() |+---------------------+| 2025-01-07 09:59:21 |+---------------------+1 row in set (0.00 sec)
-- 事务没有提交,但统计信息已收集greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| testdb1 | test1 | PRIMARY | 2025-01-07 09:59:19 | n_diff_pfx01 | 11 | 1 | id || testdb1 | test1 | PRIMARY | 2025-01-07 09:59:19 | n_leaf_pages | 1 | NULL | Number of leaf pages in the index || testdb1 | test1 | PRIMARY | 2025-01-07 09:59:19 | size | 1 | NULL | Number of pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:59:19 | n_diff_pfx01 | 11 | 1 | k || testdb1 | test1 | k_1 | 2025-01-07 09:59:19 | n_diff_pfx02 | 11 | 1 | k,id || testdb1 | test1 | k_1 | 2025-01-07 09:59:19 | n_leaf_pages | 1 | NULL | Number of leaf pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:59:19 | size | 1 | NULL | Number of pages in the index |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+7 rows in set (0.00 sec)
greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| testdb1 | test1 | PRIMARY | 2025-01-07 09:59:29 | n_diff_pfx01 | 98712 | 20 | id || testdb1 | test1 | PRIMARY | 2025-01-07 09:59:29 | n_leaf_pages | 1371 | NULL | Number of leaf pages in the index || testdb1 | test1 | PRIMARY | 2025-01-07 09:59:29 | size | 1379 | NULL | Number of pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:59:29 | n_diff_pfx01 | 30169 | 20 | k || testdb1 | test1 | k_1 | 2025-01-07 09:59:29 | n_diff_pfx02 | 100268 | 20 | k,id || testdb1 | test1 | k_1 | 2025-01-07 09:59:29 | n_leaf_pages | 128 | NULL | Number of leaf pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:59:29 | size | 161 | NULL | Number of pages in the index |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+7 rows in set (0.00 sec)
-- 回滚事务greatsql> ROLLBACK;Query OK, 0 rows affected (2.64 sec)
-- 没有重新收集统计信息,统计信息与表数据不匹配greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| testdb1 | test1 | PRIMARY | 2025-01-07 09:59:29 | n_diff_pfx01 | 98712 | 20 | id || testdb1 | test1 | PRIMARY | 2025-01-07 09:59:29 | n_leaf_pages | 1371 | NULL | Number of leaf pages in the index || testdb1 | test1 | PRIMARY | 2025-01-07 09:59:29 | size | 1379 | NULL | Number of pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:59:29 | n_diff_pfx01 | 30169 | 20 | k || testdb1 | test1 | k_1 | 2025-01-07 09:59:29 | n_diff_pfx02 | 100268 | 20 | k,id || testdb1 | test1 | k_1 | 2025-01-07 09:59:29 | n_leaf_pages | 128 | NULL | Number of leaf pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:59:29 | size | 161 | NULL | Number of pages in the index |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+7 rows in set (0.00 sec)
greatsql> SELECT COUNT(*) FROM test1;+----------+| count(*) |+----------+| 0 |+----------+1 row in set (0.00 sec)
-- analyze重新收集统计信息,统计信息才和表数据一致greatsql> ANALYZE TABLE test1;+---------------+---------+----------+----------+| Table | Op | Msg_type | Msg_text |+---------------+---------+----------+----------+| testdb1.test1 | analyze | status | OK |+---------------+---------+----------+----------+1 row in set (0.06 sec)
greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| testdb1 | test1 | PRIMARY | 2025-01-07 10:01:58 | n_diff_pfx01 | 0 | 1 | id || testdb1 | test1 | PRIMARY | 2025-01-07 10:01:58 | n_leaf_pages | 1 | NULL | Number of leaf pages in the index || testdb1 | test1 | PRIMARY | 2025-01-07 10:01:58 | size | 1 | NULL | Number of pages in the index || testdb1 | test1 | k_1 | 2025-01-07 10:01:58 | n_diff_pfx01 | 0 | 1 | k || testdb1 | test1 | k_1 | 2025-01-07 10:01:58 | n_diff_pfx02 | 0 | 1 | k,id || testdb1 | test1 | k_1 | 2025-01-07 10:01:58 | n_leaf_pages | 1 | NULL | Number of leaf pages in the index || testdb1 | test1 | k_1 | 2025-01-07 10:01:58 | size | 1 | NULL | Number of pages in the index |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+7 rows in set (0.00 sec)
2.2 DELETE 操作
greatsql> SELECT COUNT(*) FROM test1;+----------+| count(*) |+----------+| 100000 |+----------+1 row in set (0.15 sec)
-- 开启事务,执行delete操作greatsql> BEGIN;Query OK, 0 rows affected (0.00 sec)
greatsql> SELECT now();DELETE FROM test1;SELECT now();+---------------------+| now() |+---------------------+| 2025-01-07 09:41:36 |+---------------------+1 row in set (0.00 sec)
Query OK, 100000 rows affected (1.87 sec)
+---------------------+| now() |+---------------------+| 2025-01-07 09:41:38 |+---------------------+1 row in set (0.00 sec)
-- 在delete开始时,进行了一次统计信息收集greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| testdb1 | test1 | PRIMARY | 2025-01-07 09:41:36 | n_diff_pfx01 | 98712 | 20 | id || testdb1 | test1 | PRIMARY | 2025-01-07 09:41:36 | n_leaf_pages | 1371 | NULL | Number of leaf pages in the index || testdb1 | test1 | PRIMARY | 2025-01-07 09:41:36 | size | 1379 | NULL | Number of pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:41:36 | n_diff_pfx01 | 32313 | 20 | k || testdb1 | test1 | k_1 | 2025-01-07 09:41:36 | n_diff_pfx02 | 99244 | 20 | k,id || testdb1 | test1 | k_1 | 2025-01-07 09:41:36 | n_leaf_pages | 128 | NULL | Number of leaf pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:41:36 | size | 161 | NULL | Number of pages in the index |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+7 rows in set (0.00 sec)
-- delete完成一段时间后(约10秒)进行了第二次统计信息收集greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| testdb1 | test1 | PRIMARY | 2025-01-07 09:41:46 | n_diff_pfx01 | 0 | 20 | id || testdb1 | test1 | PRIMARY | 2025-01-07 09:41:46 | n_leaf_pages | 1371 | NULL | Number of leaf pages in the index || testdb1 | test1 | PRIMARY | 2025-01-07 09:41:46 | size | 1379 | NULL | Number of pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:41:46 | n_diff_pfx01 | 0 | 20 | k || testdb1 | test1 | k_1 | 2025-01-07 09:41:46 | n_diff_pfx02 | 0 | 20 | k,id || testdb1 | test1 | k_1 | 2025-01-07 09:41:46 | n_leaf_pages | 128 | NULL | Number of leaf pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:41:46 | size | 161 | NULL | Number of pages in the index |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+7 rows in set (0.00 sec)
-- 回滚事务greatsql> ROLLBACK;Query OK, 0 rows affected (1.95 sec)
-- 事务回滚后,统计信息与表数据不匹配greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| testdb1 | test1 | PRIMARY | 2025-01-07 09:41:46 | n_diff_pfx01 | 0 | 20 | id || testdb1 | test1 | PRIMARY | 2025-01-07 09:41:46 | n_leaf_pages | 1371 | NULL | Number of leaf pages in the index || testdb1 | test1 | PRIMARY | 2025-01-07 09:41:46 | size | 1379 | NULL | Number of pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:41:46 | n_diff_pfx01 | 0 | 20 | k || testdb1 | test1 | k_1 | 2025-01-07 09:41:46 | n_diff_pfx02 | 0 | 20 | k,id || testdb1 | test1 | k_1 | 2025-01-07 09:41:46 | n_leaf_pages | 128 | NULL | Number of leaf pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:41:46 | size | 161 | NULL | Number of pages in the index |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+7 rows in set (0.00 sec)
greatsql> SELECT COUNT(*) FROM test1;+----------+| count(*) |+----------+| 100000 |+----------+1 row in set (0.15 sec)
-- analyze重新收集统计信息,统计信息才和表数据一致greatsql> ANALYZE TABLE test1;+---------------+---------+----------+----------+| Table | Op | Msg_type | Msg_text |+---------------+---------+----------+----------+| testdb1.test1 | analyze | status | OK |+---------------+---------+----------+----------+1 row in set (0.08 sec)
greatsql> SELECT * FROM mysql.innodb_index_stats WHERE database_name='testdb1' AND table_name='test1';+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+| testdb1 | test1 | PRIMARY | 2025-01-07 09:47:29 | n_diff_pfx01 | 98712 | 20 | id || testdb1 | test1 | PRIMARY | 2025-01-07 09:47:29 | n_leaf_pages | 1371 | NULL | Number of leaf pages in the index || testdb1 | test1 | PRIMARY | 2025-01-07 09:47:29 | size | 1379 | NULL | Number of pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:47:29 | n_diff_pfx01 | 32332 | 20 | k || testdb1 | test1 | k_1 | 2025-01-07 09:47:29 | n_diff_pfx02 | 100051 | 20 | k,id || testdb1 | test1 | k_1 | 2025-01-07 09:47:29 | n_leaf_pages | 128 | NULL | Number of leaf pages in the index || testdb1 | test1 | k_1 | 2025-01-07 09:47:29 | size | 161 | NULL | Number of pages in the index |+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+7 rows in set (0.01 sec)
3. 总结
数据量变化大(超过 10%)的 DML 操作会导致 2 次统计信息收集,一次是 DML 开始时,一次是 DML 完成约 10 秒后。
DML 操作是否
COMMIT提交,不影响统计信息收集。DML 操作的
rollback回滚,可能造成统计信息与表数据不一致。当大数据 DML 操作回滚后,可以执行ANALYZE TABLE重新收集表的统计信息。
版权声明: 本文为 InfoQ 作者【GreatSQL】的原创文章。
原文链接:【http://xie.infoq.cn/article/1884855bdaffce9c4b3640e3a】。文章转载请联系作者。
GreatSQL
GreatSQL社区 2023-01-31 加入
GreatSQL是由万里数据库维护的MySQL分支,专注于提升MGR可靠性及性能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版本。 社区:https://greatsql.cn/ Gitee: https://gitee.com/GreatSQL/GreatSQL









评论