写点什么

一条 like 条件的慢 SQL 语句优化

  • 2022 年 7 月 11 日
  • 本文字数:5250 字

    阅读完需:约 17 分钟

作者: jiyf 原文来源:https://tidb.net/blog/d4593315


【是否原创】是


【首发渠道】TiDB 社区

问题背景

遇到下面这条慢 sql,查询计划走了高效的唯一索引,满足条件的行数只有 10 行,但是耗时却达到了 10s。


mysql> select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name like 'dir_abc4999/%'  order by name asc limit 501;                                 +----------------------------------+--------+-------+----------------------------------+| name                             | uid    | size  | etag                             |+----------------------------------+--------+-------+----------------------------------+| dir_abc4999/obj_20211027_1_0.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_1.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_2.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_3.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_4.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_5.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_6.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_7.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_8.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_9.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |+----------------------------------+--------+-------+----------------------------------+10 rows in set (10.08 sec)
复制代码


表有一个唯一性的联合索引,可以完好地覆盖查询条件列 bucket_id 和 name,从表结构和 sql 语句第一印象是这条 sql 应该是非常快的,但是却出现了上面的结果。


mysql> show create table objects\"G*************************** 1. row ***************************       Table: objectsCreate Table: CREATE TABLE `objects` (  `id` bigint(20) NOT NULL /*T![auto_rand] AUTO_RANDOM(5) */,  `bucket_id` varchar(192) NOT NULL,  `name` varchar(512) NOT NULL,  `version_id` varchar(64) NOT NULL,  ...  ...  PRIMARY KEY (`id`),  UNIQUE KEY `idx` (`bucket_id`,`name`,`version_id`),  KEY `cluster_id` (`cluster_id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T![auto_rand_base] AUTO_RANDOM_BASE=2313570001 */1 row in set (0.00 sec)
复制代码

问题排查

看查询计划好像没有太大问题,走了联合索引做范围查找,最后回表查询返回结果,按道理这个执行计划非常好,那问题在哪里呢。


mysql> explain select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name like 'dir_abc4999/%'  order by name asc limit 501\"G*************************** 1. row ***************************           id: Projection_7      estRows: 501.00         task: rootaccess object: operator info: ogw.objects.name, ogw.objects.uid, ogw.objects.size, ogw.objects.etag*************************** 2. row ***************************           id: └─Projection_35      estRows: 501.00         task: rootaccess object: operator info: ogw.objects.bucket_id, ogw.objects.name, ogw.objects.uid, ogw.objects.size, ogw.objects.etag*************************** 3. row ***************************           id:   └─IndexLookUp_34      estRows: 501.00         task: rootaccess object: operator info: limit embedded(offset:0, count:501)*************************** 4. row ***************************           id:     ├─Limit_33(Build)      estRows: 501.00         task: cop[tikv]access object: operator info: offset:0, count:501*************************** 5. row ***************************           id:     │ └─Selection_32      estRows: 501.00         task: cop[tikv]access object: operator info: like(ogw.objects.name, "dir_abc4999/%", 92)*************************** 6. row ***************************           id:     │   └─IndexRangeScan_30      estRows: 501.00         task: cop[tikv]access object: table:objects, index:idx(bucket_id, name, version_id)operator info: range:(".bucket.meta.cmu-bucket01" "dir",".bucket.meta.cmu-bucket01" "dis"), keep order:true*************************** 7. row ***************************           id:     └─TableRowIDScan_31(Probe)      estRows: 501.00         task: cop[tikv]access object: table:objectsoperator info: keep order:false, stats:pseudo7 rows in set (0.00 sec)
复制代码


唯一奇怪的在于上面执行计划中的第 6 行 IndexRangeScan_30 中,like 算子做范围查询的范围,计划里选择的是 [“dir”, “dis”],按道理应该是 [“dir_abc4999/”, “dir_abc49990”] 才对呀(因为 ‘/’ 下一个字符是 ‘0’),满足 [“dir”, “dis”] 范围的行数竟然达到 2 亿 + 条,难怪会这么慢。


mysql> select count(1) from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name >= 'dir' and name < 'dis';+-----------+| count(1)  |+-----------+| 233596134 |+-----------+1 row in set (7.21 sec)
复制代码


从这里不免怀疑是 tidb 执行计划这里出了问题,错误的选择了 index 的 range 范围,导致扫描了不必要的 keys,难道是 tidb 执行计划的 bug 问题?


改写 sql,将出问题的 sql 改写为它的等价 sql,耗时 0.00 秒!!


mysql> select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name >= 'dir_abc4999/' and name <='dir_abc49990'  order by name asc limit 501;  +----------------------------------+--------+-------+----------------------------------+| name                             | uid    | size  | etag                             |+----------------------------------+--------+-------+----------------------------------+| dir_abc4999/obj_20211027_1_0.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_1.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_2.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_3.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_4.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_5.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_6.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_7.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_8.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_9.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |+----------------------------------+--------+-------+----------------------------------+10 rows in set (0.00 sec)
复制代码


查看改写后 sql 的执行计划,执行计划相同,但是这里联合索引 name 列的范围变为了 [“dir_abc4999/”, “dir_abc49990”],这里扫描的 keys 数量紧紧为 10 个。


mysql> explain select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name >= 'dir_abc4999/' andname <= 'dir_abc49990'  order by name asc limit 501\"G*************************** 1. row ***************************           id: Projection_7      estRows: 1.00         task: rootaccess object: operator info: ogw.objects.name, ogw.objects.uid, ogw.objects.size, ogw.objects.etag*************************** 2. row ***************************           id: └─Projection_32      estRows: 1.00         task: rootaccess object: operator info: ogw.objects.bucket_id, ogw.objects.name, ogw.objects.uid, ogw.objects.size, ogw.objects.etag*************************** 3. row ***************************           id:   └─IndexLookUp_31      estRows: 1.00         task: rootaccess object: operator info: limit embedded(offset:0, count:501)*************************** 4. row ***************************           id:     ├─Limit_30(Build)      estRows: 1.00         task: cop[tikv]access object: operator info: offset:0, count:501*************************** 5. row ***************************           id:     │ └─IndexRangeScan_28      estRows: 1.00         task: cop[tikv]access object: table:objects, index:idx(bucket_id, name, version_id)operator info: range:[".bucket.meta.cmu-bucket01" "dir_abc4999/",".bucket.meta.cmu-bucket01" "dir_abc49990"], keep order:true*************************** 6. row ***************************           id:     └─TableRowIDScan_29(Probe)      estRows: 1.00         task: cop[tikv]access object: table:objectsoperator info: keep order:false, stats:pseudo6 rows in set (0.00 sec)
复制代码


看到这里似乎找到了问题的原因,是 tidb 的 like 算子不准,导致执行计划扫描过多的数据,tidb 在这里存在 bug,通过改写这个 like sql 到后面那个来规避这个问题吧!!!!

真实原因

关于 sql 的 like 语句,’_’ 符号竟然是个通配符!!!!


With LIKE you can use the following two wildcard characters in the pattern:

  • % matches any number of characters, even zero characters.

  • _ matches exactly one character.


mysql 官方文档关于 like 语法的描述:链接


也就是说 name like ‘dir_abc4999/%’ 的条件中,’_‘ 可以为任何字符,这样的话匹配的前缀只有 “dir”,tidb 选择的 range 范围恰好是正确的!!!


这里业务错误的拿 ’_‘ 符号来进行精确匹配,出现了误用 like 语法中特殊字符的问题。


这里将 ’_’ 进行转义,问题同样也就可以解决了:


mysql> select name, uid, size, etag from objects where bucket_id = '.bucket.meta.cmu-bucket01' and name like 'dir\"_abc4999/%'  order by name asc limit 501;+----------------------------------+--------+-------+----------------------------------+| name                             | uid    | size  | etag                             |+----------------------------------+--------+-------+----------------------------------+| dir_abc4999/obj_20211027_1_0.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_1.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_2.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_3.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_4.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_5.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_6.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_7.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_8.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e || dir_abc4999/obj_20211027_1_9.txt | test01 | 61441 | 79702d022dafea0173dc9bcbce1b562e |+----------------------------------+--------+-------+----------------------------------+10 rows in set (0.00 sec)
复制代码


发布于: 刚刚阅读数: 4
用户头像

TiDB 社区官网:https://tidb.net/ 2021.12.15 加入

TiDB 社区干货传送门是由 TiDB 社区中布道师组委会自发组织的 TiDB 社区优质内容对外宣布的栏目,旨在加深 TiDBer 之间的交流和学习。一起构建有爱、互助、共创共建的 TiDB 社区 https://tidb.net/

评论

发布
暂无评论
一条 like 条件的慢 SQL 语句优化_管理与运维_TiDB 社区干货传送门_InfoQ写作社区