写点什么

MySQL 派生条件下推优化导致自定义变量结果错误问题分析

作者:GreatSQL
  • 2025-05-22
    福建
  • 本文字数:3359 字

    阅读完需:约 11 分钟

1. 问题描述

现场 MySQL 8.0.25 版本运行中,发现带有用户变量的 SQL 执行结果集与预期不符。复现场景如下:


-- 起锅烧油,连接 MySQL 8.0.25 数据库mysql -uroot -S /var/lib/mysql/mysql-8.0.25-linux-x86_64/mysql-test/var/tmp/mysqld.1.sock -p -- 执行以下SQLSELECT version();CREATE database testdb;use testdbCREATE TABLE t1(c1 int);INSERT INTO t1 VALUES(100),(200),(300),(400),(500);SELECT * FROM (  SELECT c1, (@rownum_r:=@rownum_r+1) AS r   FROM t1, (SELECT @rownum_r:=0) AS b) AS q WHERE q.c1=300;   -- 这里查询到的结果是300  1,实际应该是300  3
复制代码


分析上述的执行结果中的 rownum 返回 1,可以推断派生表返回的条数少于预期,导致 rownum 值偏小。由此想到看下优化器改写后的 SQL


mysql> explain SELECT * FROM (SELECT c1, (@rownum_r := @rownum_r + 1) AS r FROM  t1, (SELECT @rownum_r := 0) AS b) AS q WHERE q.c1 = 300;+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+| id | select_type | table      | partitions | type   | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL          | NULL | NULL    | NULL |    2 |   100.00 | NULL           ||  2 | DERIVED     | <derived3> | NULL       | system | NULL          | NULL | NULL    | NULL |    1 |   100.00 | NULL           ||  2 | DERIVED     | t1         | NULL       | ALL    | NULL          | NULL | NULL    | NULL |    5 |    20.00 | Using where    ||  3 | DERIVED     | NULL       | NULL       | NULL   | NULL          | NULL | NULL    | NULL | NULL |     NULL | No tables used |+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+4 rows in set, 3 warnings (0.01 sec)
mysql> SHOW warnings;+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+| Level | Code | Message |+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+.......| Note | 1003 | /* select#1 */ select `q`.`c1` AS `c1`,`q`.`r` AS `r` from (/* select#2 */ select `testdb`.`t1`.`c1` AS `c1`,(@rownum_r:=((@`rownum_r`) + 1)) AS `r` from `testdb`.`t1` where (`testdb`.`t1`.`c1` = 300)) `q` |+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
复制代码


印证了上面的推断,优化器将 WHERE 条件下推到派生表中,减少了派生表返回行数,以此加快 SQL 执行速度。但是引起了 rownum 递增减少问题。


由此可知这么来看,属于优化器的 Bug,而在 Bug home 中也找到了现象匹配的问题


  • Derived condition pushdown rewrite ignores user variables https://bugs.mysql.com/bug.php?id=104918

2. 问题分析

2.1 什么是派生条件下推

引自 10.2.2.5 Derived Condition Pushdown Optimization 的相关描述:


对于 SELECT * FROM (SELECT i, j FROM t1) AS dt WHERE i > constant 这种子查询语句,大多情况下可以将外层的 where 条件下推到派生表内部,改写为 SELECT * FROM (SELECT i, j FROM t1 WHERE i > constant) AS dt,来减少派生表返回条数,达到提升执行效率的作用。


比如以下场景,就很适用 WHERE 条件下推:


  • 当派生表不使用聚合或窗口函数时,可以直接向下推送外部 WHERE 条件。这包括具有多个与 AND、OR 或两者结合的谓词的 WHERE 条件。比如 SELECT * FROM (SELECT f1, f2 FROM t1) AS dt WHERE f1 < 3 AND f2 > 11 改写为 SELECT f1, f2 FROM (SELECT f1, f2 FROM t1 WHERE f1 < 3 AND f2 > 11) AS dt

  • 当派生表具有 GROUP BY 并且不使用窗口函数时,引用一个或多个不属于 GROUP BY 的列的外部 WHERE 条件可以作为 HAVING 条件下推到派生表中。比如 SELECT * FROM (SELECT i, j, SUM(k) AS sum FROM t1 GROUP BY i, j) AS dt WHERE sum > 100 被改写为SELECT * FROM (SELECT i, j, SUM(k) AS sum FROM t1 GROUP BY i, j HAVING sum > 100) AS dt.

  • 当派生表使用 GROUP BY 并且外部 WHERE 条件中的列是 GROUP BY 列时,引用这些列的 WHERE 条件可以直接下推到派生表中。比如 SELECT * FROM (SELECT i,j, SUM(k) AS sum FROM t1 GROUP BY i,j) AS dt WHERE i > 10 被改写为 SELECT * FROM (SELECT i,j, SUM(k) AS sum FROM t1 WHERE i > 10 GROUP BY i,j) AS dt.


derived_condition_pushdown 开关在 optimizer_switch 参数中设置,也可以使用相关 hint 控制在语句级生效。方法如下:


-- 开启:SET optimizer_switch="derived_condition_pushdown=on";SELECT /*+ DERIVED_CONDITION_PUSHDOWN() */ ... FROM ...
-- 关闭:SET optimizer_switch="derived_condition_pushdown=off";SELECT /*+ NO_DERIVED_CONDITION_PUSHDOWN() */ ... FROM ...
复制代码

2.2 Bug 影响范围有多大

经过查阅 releasenote,可知条件下推是在 8.0.22 中引入的,修复的版本是 8.0.28


-- 8.0.22的发布日志:MySQL now implements derived condition pushdown for eligible queries. What this means is that, for a query such as SELECT * FROM (SELECT i, j FROM t1) AS dt WHERE i > constant, it is now possible in many cases to push the outer WHERE condition down to the derived table, in this case resulting in SELECT * FROM (SELECT i, j FROM t1 WHERE i > constant) AS dt. Previously, if the derived table was materialized and not merged, MySQL materialized the entire table—in this case t1—then qualified the rows with the WHERE condition.
-- 8.0.28的修复日志:Documented fix as follows in the MySQL 8.0.28 changelog:
When a condition was pushed down, the result of evaluating assignments to user variables in the SELECT list of the subquery were sometimes affected. For this reason, we now prevent condition pushdown for statements with assignments to user variables.
复制代码

3. 业务规避措施

  1. 升级到 MySQL 8.0.28 及以上版本,或者 GreatSQL 分支版本 https://gitee.com/GreatSQL/GreatSQL/releases/tag/GreatSQL-8.0.32-27。

  2. 数据库实例级别关闭该优化,set optimizer_switch="derived_condition_pushdown=off";

  3. 在涉及的 SQL 语句上添加 hint /*+ NO_DERIVED_CONDITION_PUSHDOWN() */ ,语句级别禁用derived_condition_pushdown 优化,可以规避 MySQL Bug 104918

参考文章

  1. MySQL 8.0.22 releasenote:https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-22.html

  2. Derived condition pushdown rewrite ignores user variables https://bugs.mysql.com/bug.php?id=104918

  3. Derived Condition Pushdown Optimization https://dev.mysql.com/doc/refman/8.4/en/derived-condition-pushdown-optimization.html

  4. GreatSQL 分支版本 https://gitee.com/GreatSQL/GreatSQL/releases/tag/GreatSQL-8.0.32-27

发布于: 刚刚阅读数: 2
用户头像

GreatSQL

关注

GreatSQL社区 2023-01-31 加入

GreatSQL是由万里数据库维护的MySQL分支,专注于提升MGR可靠性及性能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版本。 社区:https://greatsql.cn/ Gitee: https://gitee.com/GreatSQL/GreatSQL

评论

发布
暂无评论
MySQL派生条件下推优化导致自定义变量结果错误问题分析_GreatSQL_InfoQ写作社区