A Few TiDB SQL Optimization Cases
July 11, 2022
Author: mydb. Original source: https://tidb.net/blog/b0222e7f
1. Overview
In day-to-day TiDB operations, cluster jitter is very often caused by a handful of slow SQL statements. This post shares several SQL optimization cases from production as a starting point, in the hope of offering some methods and ideas for everyday SQL tuning.
2. Optimization Case 1
Original SQL
Problem
The table has more than 1.3 billion rows and carries the indexes (rTopicId, replyId) and (replyId), both reasonably selective, and this SQL returns 0 rows. Yet it actually ran for over 800 seconds and scanned about 1.3 billion keys; a few dozen requests like this could bring the cluster down. Part of the slow log is shown below:
# Query_time: 831.631793825
# Process_time: 2980.99 Wait_time: 9430.584 Request_count: 15455 Total_keys: 1316893603 Process_keys: 1312707991
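The original statement is not reproduced above; based on Method 2 and Method 3 below, it had roughly this scalar-subquery shape (a reconstruction for readability, not the verbatim production SQL):
SELECT
  rDelete
FROM
  table_xxx
WHERE
  rTopicId = 90168191
  AND replyId = (
    SELECT rTargetReplyid FROM table_xxx WHERE replyId = 1319646872
  );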
Optimization Methods
Method 1
Split the original statement into the two SQL statements below; each is very efficient on its own and returns in milliseconds. The application takes the rTargetReplyid returned by the first statement and substitutes it for the xxx placeholder in the second.
SELECT rTargetReplyid FROM table_xxx WHERE replyId = 1319646872;
SELECT rDelete FROM table_xxx WHERE rTopicId = 90168191 AND replyId = xxx;
Method 2
Rewrite the subquery as a join. The rewritten statement also returns in milliseconds.
select
t.rDelete
from
table_xxx t,
(
select
rTargetReplyid
from
table_xxx
where
replyId = 1319646872
) tmp_t
where
t.rTopicId = 90168191
and t.replyId = tmp_t.rTargetReplyid;
Method 3
Keep the subquery, but add the rTopicId column to the subquery's filter as well:
SELECT
rDelete
FROM
table_xxx
WHERE
rTopicId = 90168191
AND replyId = (
SELECT
rTargetReplyid
FROM
table_xxx
WHERE
rTopicId = 90168191
AND replyId = 1319646872
);
In terms of performance:
Method 1 > Method 2 > Method 3
Method 1 is recommended: the SQL is simple and efficient. It is also what we eventually adopted in production, with very good results.
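As a quick sanity check before rollout, one can run the split statements through EXPLAIN (desc) and confirm that each one hits its index, i.e. (replyId) for the first and (rTopicId, replyId) for the second. A sketch using the literal values from this case:
desc SELECT rTargetReplyid FROM table_xxx WHERE replyId = 1319646872;
desc SELECT rDelete FROM table_xxx WHERE rTopicId = 90168191 AND replyId = 123456;   # 123456 stands in for the rTargetReplyid returned by the first statement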
Results and Takeaways
Before optimization the SQL ran for over 800 seconds; after optimization it returns in milliseconds. When writing SQL, try to avoid subqueries so as to prevent the kinds of problems they can bring.
3. Optimization Case 2
Original SQL
Problem
Exactly the same SQL returns results normally with limit 10000 and with limit 20000, but with limit 30000 it never returns.
Optimization Method
Increase several TiDB system variables to raise concurrency. The main ones adjusted were:
set global tidb_index_lookup_concurrency=8;        # default: 4
set global tidb_index_lookup_join_concurrency=8;   # default: 4
set global tidb_index_join_batch_size=50000;       # default: 25000
The variables are described briefly below; see the official documentation for more details.
tidb_index_lookup_concurrency
Scope: SESSION | GLOBAL
Default: 4
Sets the concurrency of index lookup operations. Larger values suit AP workloads; smaller values suit TP workloads.
tidb_index_join_batch_size
Scope: SESSION | GLOBAL
Default: 25000
Sets the batch size of index lookup join operations. Larger values suit AP workloads; smaller values suit TP workloads.
tidb_index_lookup_join_concurrency
Scope: SESSION | GLOBAL
Default: 4
Sets the concurrency of the index lookup join algorithm.
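Since all three variables support SESSION scope, a lower-risk approach is to verify the effect in a single session before changing the GLOBAL values. A minimal sketch using the same values as above:
show variables like 'tidb_index%';                  # check the current values first
set session tidb_index_lookup_concurrency = 8;
set session tidb_index_lookup_join_concurrency = 8;
set session tidb_index_join_batch_size = 50000;
# re-run the problematic query with limit 30000 in this session and compare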
Results and Takeaways
After tuning these TiDB variables, the SQL with limit 30000 returns results normally and its response time improved slightly. For OLAP-style systems, these variables can be tuned further according to the actual workload.
4. Optimization Case 3
Original SQL
Problem
Table a has a composite index (state, biz_type, biz_id) through which it returns only 1 row, so that side is very efficient; table e has a unique index on object_uid, so in theory the query should be very fast. In reality the SQL takes over 250 seconds, far too slow to be reasonable.
Optimization Method
The execution plan showed the SQL was reading from TiFlash and took 256 seconds. Let's add a hint to force it onto TiKV and look at the plan and the timing:
desc SELECT /*+ read_from_storage(tikv[a],tikv[e]) */
a.biz_id,
a.biz_type,
a.short_title,
a.img_url,
a.author,
a.graphic_img_list3,
a.jump_url,
a.cms_series_ids,
a.car_brand_ids,
a.cms_level_ids,
a.cms_spec_ids,
a.cms_tags,
a.nlp_tags_choose2,
a.city_ids,
a.cms_content_class,
a.recommend_time,
a.created_stime AS create_at,
a.modified_stime,
e.uniq_brands_id,
e.uniq_brands_name,
e.uniq_category_id,
e.uniq_category_name,
e.uniq_city_id,
e.uniq_city_name,
e.uniq_keywords_id,
e.uniq_keywords_name,
e.uniq_nlp_is_accident,
e.uniq_nlp_is_beauty,
e.uniq_series_id,
e.uniq_series_name
FROM
tb1 a
LEFT JOIN tb2 e ON CONCAT(a.biz_type, '-2014-', a.biz_id) = e.object_uid
where
a.state = 0
and a.biz_type = 600038
and a.biz_id in (1087200);
+-------------------------------+----------+-----------+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+----------+-----------+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_6 | 8.00 | root | | test.tb1.biz_id, test.tb1.biz_type, test.tb1.short_title, test.tb1.img_url, test.tb1.author, test.tb1.graphic_img_list3, test.tb1.jump_url, test.tb1.cms_series_ids, test.tb1.car_brand_ids, test.tb1.cms_level_ids, test.tb1.cms_spec_ids, test.tb1.cms_tags, test.tb1.nlp_tags_choose2, test.tb1.city_ids, test.tb1.cms_content_class, test.tb1.recommend_time, test.tb1.created_stime, test.tb1.modified_stime, test.tb2.uniq_brands_id, test.tb2.uniq_brands_name, test.tb2.uniq_category_id, test.tb2.uniq_category_name, test.tb2.uniq_city_id, test.tb2.uniq_city_name, test.tb2.uniq_keywords_id, test.tb2.uniq_keywords_name, test.tb2.uniq_nlp_is_accident, test.tb2.uniq_nlp_is_beauty, test.tb2.uniq_series_id, test.tb2.uniq_series_name |
| └─HashJoin_8 | 8.00 | root | | CARTESIAN left outer join |
| ├─Selection_10(Build) | 0.00 | root | | eq(test.tb1.state, 0) |
| │ └─Point_Get_9 | 1.00 | root | table:tb1, index:uniq_index_id_type(biz_id, biz_type) | |
| └─TableReader_13(Probe) | 8000.00 | root | | data:Selection_12 |
| └─Selection_12 | 8000.00 | cop[tikv] | | eq(concat(cast(600038), "-2014-", cast(1087200)), test.tb2.object_uid) |
| └─TableFullScan_11 | 10000.00 | cop[tikv] | table:e | keep order:false, stats:pseudo |
+-------------------------------+----------+-----------+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.00 sec)
Reading from TiKV takes 21 seconds, much better than the 250+ seconds on TiFlash. However, the plan shows that table e does not use its unique index at all; it does a full table scan. Why? Zhenjiao from PingCAP helped analyze this SQL: because the join key is the expression CONCAT(a.biz_type, '-2014-', a.biz_id), TiDB does not generate a corresponding virtual column for it, so even though e has a unique index on object_uid, it cannot be used (there is room for improvement in the optimizer here).
Next, let's try rewriting the SQL instead. The rewritten statement is:
desc select a1.*,
e.uniq_brands_id,
e.uniq_brands_name,
e.uniq_category_id,
e.uniq_category_name,
e.uniq_city_id,
e.uniq_city_name,
e.uniq_keywords_id,
e.uniq_keywords_name,
e.uniq_nlp_is_accident,
e.uniq_nlp_is_beauty,
e.uniq_series_id,
e.uniq_series_name
from
(select
CONCAT(a.biz_type,'-2014-',a.biz_id) as c1,
a.biz_type,
a.short_title,
a.img_url,
a.author,
a.graphic_img_list3,
a.jump_url,
a.cms_series_ids,
a.car_brand_ids,
a.cms_level_ids,
a.cms_spec_ids,
a.cms_tags,
a.nlp_tags_choose2,
a.city_ids,
a.cms_content_class,
a.recommend_time,
a.created_stime AS create_at,
a.modified_stime
from tb1 a
where a.state = 0 and a.biz_type = 600038 and a.biz_id in (1087200)
) a1
left join tb2 e on a1.c1 = e.object_uid;
+-----------------------------------+---------+-----------+-------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------------+---------+-----------+-------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_7 | 0.00 | root | | Column#106, test.tb1.biz_type, test.tb1.short_title, test.tb1.img_url, test.tb1.author, test.tb1.graphic_img_list3, test.tb1.jump_url, test.tb1.cms_series_ids, test.tb1.car_brand_ids, test.tb1.cms_level_ids, test.tb1.cms_spec_ids, test.tb1.cms_tags, test.tb1.nlp_tags_choose2, test.tb1.city_ids, test.tb1.cms_content_class, test.tb1.recommend_time, test.tb1.created_stime, test.tb1.modified_stime, test.tb2.uniq_brands_id, test.tb2.uniq_brands_name, test.tb2.uniq_category_id, test.tb2.uniq_category_name, test.tb2.uniq_city_id, test.tb2.uniq_city_name, test.tb2.uniq_keywords_id, test.tb2.uniq_keywords_name, test.tb2.uniq_nlp_is_accident, test.tb2.uniq_nlp_is_beauty, test.tb2.uniq_series_id, test.tb2.uniq_series_name |
| └─IndexJoin_11 | 0.00 | root | | left outer join, inner:IndexLookUp_10, outer key:Column#106, inner key:test.tb2.object_uid, equal cond:eq(Column#106, test.tb2.object_uid) |
| ├─Projection_16(Build) | 0.00 | root | | concat(cast(test.tb1.biz_type, var_string(20)), -2014-, cast(test.tb1.biz_id, var_string(20)))->Column#106, test.tb1.biz_type, test.tb1.short_title, test.tb1.img_url, test.tb1.author, test.tb1.graphic_img_list3, test.tb1.jump_url, test.tb1.cms_series_ids, test.tb1.car_brand_ids, test.tb1.cms_level_ids, test.tb1.cms_spec_ids, test.tb1.cms_tags, test.tb1.nlp_tags_choose2, test.tb1.city_ids, test.tb1.cms_content_class, test.tb1.recommend_time, test.tb1.created_stime, test.tb1.modified_stime |
| │ └─Selection_18 | 0.00 | root | | eq(test.tb1.state, 0) |
| │ └─Point_Get_17 | 1.00 | root | table:tb1, index:uniq_index_id_type(biz_id, biz_type) | |
| └─IndexLookUp_10(Probe) | 1.00 | root | | |
| ├─IndexRangeScan_8(Build) | 1.00 | cop[tikv] | table:e, index:uniq_object_uid(object_uid) | range: decided by [eq(test.tb2.object_uid, Column#106)], keep order:false, stats:pseudo |
| └─TableRowIDScan_9(Probe) | 1.00 | cop[tikv] | table:e | keep order:false, stats:pseudo |
+-----------------------------------+---------+-----------+-------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.01 sec)
Results and Takeaways
Before optimization the SQL took over 250 seconds; the rewritten SQL runs in 0.14 seconds. The TiDB team is also actively working on this issue; for details see https://github.com/pingcap/tidb/issues/22739
5. Optimization Case 4
Problem SQL
Problem
The table has a composite index (iscompleted, retrytimefornext), and for this SQL the index is highly selective, matching only 1 row. Yet the query does a full table scan, which makes it slow.
Optimization Method
This table was migrated from MySQL, where the same query uses the index and runs fast, so we suspected the type of the iscompleted column. It is a bit column; in a test environment we changed it to tinyint and checked the plan and the timing.
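A sketch of how such a test copy could be built (only the indexed columns are shown; the id and retrytimefornext types below are assumptions for illustration, and the real tb1 has more columns):
# illustrative test table: same composite index, but iscompleted stored as tinyint instead of bit
create table tb1_tinyint (
    id bigint primary key,                   # assumed key column, for illustration only
    iscompleted tinyint not null default 0,
    retrytimefornext datetime,               # type assumed
    index idx_iscompleted_retrytimefornext (iscompleted, retrytimefornext)
);
# copy the data over, converting the bit values to integers
insert into tb1_tinyint select id, cast(iscompleted as unsigned), retrytimefornext from tb1;
analyze table tb1_tinyint;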
test > desc select * from tb1_tinyint where iscompleted=0 order by retrytimefornext ;
+--------------------------------+---------+-----------+------------------------------------------------------------------------------------------+------------------------------+
| id | estRows | task | access object | operator info |
+--------------------------------+---------+-----------+------------------------------------------------------------------------------------------+------------------------------+
| IndexLookUp_20 | 32.80 | root | | |
| ├─IndexRangeScan_18(Build) | 32.80 | cop[tikv] | table:tb1_tinyint, index:idx_iscompleted_retrytimefornext(iscompleted, retrytimefornext) | range:[0,0], keep order:true |
| └─TableRowIDScan_19(Probe) | 32.80 | cop[tikv] | table:tb1_tinyint | keep order:false |
+--------------------------------+---------+-----------+------------------------------------------------------------------------------------------+------------------------------+
The plan above shows that after the type change, the SQL uses the composite index (iscompleted, retrytimefornext).
Results and Takeaways
Execution time dropped from 2.46 seconds before the change to 0.01 seconds after. TiDB's support for the bit type is not great; in TiDB, tinyint is recommended instead of bit.
6. Summary
The cases above were caused partly by how the SQL was written and partly by TiDB optimizer bugs. For every SQL statement, we recommend that the business team review it before it goes live and check its execution efficiency, especially against large tables; otherwise it may cause serious cluster jitter after launch. We also learned that TiDB will offer a query rewrite feature in the near future, so that when a slow SQL causes cluster jitter, and the business side agrees, the slow statement can be promptly rewritten to return immediately, for example rewritten to select ''.
Kudos here to how quickly the team responded to these issues and to the PingCAP folks for their strong support. TiDB is excellent and powerful; a few small blemishes are inevitable during rapid development, but they do not overshadow how good TiDB is.
Copyright notice: This is an original article by InfoQ author "TiDB 社区干货传送门".
Original link: http://xie.infoq.cn/article/7315ac02902ea94c995b38788. Please contact the author before reprinting.