TiFlash 亿级多表关联优化实践,从无法跑出结果优化到 2.59 秒
作者: 数据源的 TiDB 学习之路原文来源:https://tidb.net/blog/c369b139
前段时间遇到一个 SQL 测试用例,典型的 OLAP 分析查询语句—4 表关联,其中 2 张表数据量为亿级,查询结果集为百万级,希望在 TiDB 中测试一下大致性能预期。由于无法提供真实数据,基于表结构模拟造数测试。基于 SQL 用例,选择 TiFlash 引擎进行测试,从初始情况下无法跑出结果到最终优化到 2.59 秒,本文详细描述一下测试过程及优化经验。
准备工作
测试环境准备
所选择测试环境为三节点的 TiDB 集群,集群资源及组件部署情况如下:
默认参数调整
考虑到需要批量快速模拟造数,以及 SQL 中包括 group_concat 拼接功能,我们需要提前修改以下系统变量。
表结构及造数
测试语句共使用 4 张表,其中 2 张表模拟 1 亿条数据,2 张表模拟 1 千条数据,SQL 语句查询结果集为 100 万条左右。为了简便起见,模拟造数使用 TiDB 兼容的 CTE 递归语句批量生成。同时,为了避免大事务的影响,使用 BATCH 语句将大批量 INSERT 写入内部拆分为多个事务进行。
SQL 语句
SQL 语句为 4 表关联语句,两张亿级表按条件关联,另外两张 1000 行的表使用 group_concat 拼接为 1 行记录后与大表关联结果进行 Json 匹配,整体结果集为百万条记录级别。初始 SQL 语句如下:
SQL 优化过程
初始执行—无法查询结果
使用原始语句执行,等待数小时后,无法执行出结果。
使用 explain 查看执行计划,
执行计划中表现的主要问题包括:
两张亿级别表关联产生笛卡尔积,导致预估处理行数超大
语句执行 warning 信息,提示 group_concat、date_add 未下推
优化大表笛卡尔积问题—执行耗时 08 分 34 秒
上述两张亿级表关联产生了笛卡尔积,主要是因为关联条件使用 t1.vehicle_no = t2.vehicle_no OR t1.vin_no = t2.vehicle_no 导致,猜测可能是优化器针对带 OR 条件关联场景未做特殊优化。基于过往经验,这样的关联语句可以等价改写为两个 SELECT 子句 UNION 的方式,更改如下,
修改后,SQL 语句从无法执行出结果到 8 分钟 34 秒完成查询。新的执行计划中也不再看到巨大的预估行数,说明此等价修改有效。
优化 Json 转换导致 HashJoin 效率问题—执行耗时 19 秒
仔细查看上述语句的 explain analyze 输出,我们发现时间主要消耗在外层的 HashJoin 算子,这个 HashJoin 主要是对两张大表关联的结果集(约 100 万)与两个 Json 字符串(各 1 条记录)做笛卡尔积。关联的结果集大小并没有任何变化,但耗时却长达 8 分钟,这不符合常理,我们怀疑问题出在百万次的 Varchar->Json 类型转换上面。
从上述执行计划看,HashJoin 的 operator info 中需要针对每一行关联的记录做 2 次 cast(.., json BINARY) 动作,因为 t3 和 t4 子查询中的 city_info 和 day_info 是字符类型,这个隐式转换动作不可避免。
因此我们的优化思路是在子查询中提前将拼接的字符串转换为 Json 类型,这样理论上可以规避百万次的 cast(..,json BINARY) 动作,于是修改如下内容,
经过上述修改,执行耗时缩短到 19 秒,从执行计划可以看出,虽然 HashJoin 的 operator info 中仍然能看到 cast(.., json BINARY),但由于对应字段已经提前转换为 Json 类型,所以内部并不需要实际的隐式转换动作,性能得到大幅提升。
优化 date_add 条件下推问题—执行耗时 13 秒
虽然优化到 19 秒,但 warning 信息我们仍然没有解决,我们首先来处理 date_add。跟 date_add 相关的语句为:now() BETWEEN date_add(policy_expire_date, interval -60 day) AND policy_expire_date。TiFlash 支持粗糙索引,针对数值、日期数据类型默认每 8K 行生成 min-max 内建索引,而这里的条件过滤因为把字段放在 date_add 函数中,会导致无法走 min-max 索引的情况,因此需要将过滤条件做如下修改:
按上述修改后,语句执行后虽然不再显示 date_add 的 warning,但是从执行计划中查看过滤条件仍然没有下推执行。
经查看,原因是 policy_expire_date 对应的字段(c_t4.c2)类型为 varchar 类型,导致在语句 policy_expire_date between now() and date_add(now(), interval 60 day) 内部会包含一层隐式类型转换。因此,我们将这个 c2 字段修改为 datetime 类型,为了便于对比,创建一张新表并将数据同步到新表结构中。
现在,SQL 的执行时间被优化到 13 秒,从执行计划中可以看出,条件过滤已经能正常下推。
优化 group_concat 条件下推问题—执行耗时 2.59 秒
解决 date_add 的 warning 后,还剩下一个 group_concat 的 warning。从 TiDB 官方文档可以看出,TiFlash 存储引擎是支持 group_concat 下推的,那么为什么这里 group_concat 没有下推呢?在相关专家的指导下了解到,SQL 示例中的 group_concat 所在的查询语句是一个不带 group by 的聚合查询,TiFlash 当前针对这样的场景尚且还无法下推执行,后续版本会进行改进,详见
虽然 TiFlash 不支持这种情况的下推,但我们可以使用一些绕行方案,比如在不影响结果正确性的情况下通过增加 ORDER BY 或 DISTINCT 可以让查询下推。这里我们通增加 DISTINCT 来进行测试,修改如下
修改后的执行计划如下所示,可以看出除 root 以外所有的执行算子均已下推到 TiFlash 执行,执行耗时也被优化到 2.59 秒。
```markdown+——————————————————————–+————–+———–+————–+—————+——————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————-+—————————————————————————————————————————————————————————————————————————————————————————+———-+——+| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |+——————————————————————–+————–+———–+————–+—————+——————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————-+—————————————————————————————————————————————————————————————————————————————————————————+———-+——+| TableReader_213 | 1842063.40 | 1028305 | root | | time:2.57s, loops:1484, RU:0.000000, cop_task: {num: 801, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00} | MppVersion: 2, data:ExchangeSender_212 | 160.6 KB | N/A || └─ExchangeSender_212 | 1842063.40 | 1028305 | mpp[tiflash] | | tiflash_task:{proc max:2.56s, min:898.8ms, avg: 1.83s, p80:2.49s, p95:2.56s, iters:1652, tasks:6, threads:240} | ExchangeType: PassThrough | N/A | N/A || └─Projection_37 | 1842063.40 | 1028305 | mpp[tiflash] | | tiflash_task:{proc max:2.56s, min:898.8ms, avg: 1.82s, p80:2.49s, p95:2.56s, iters:1652, tasks:6, threads:240} | Column#37, Column#38, Column#62 | N/A | N/A || └─Selection_211 | 1842063.40 | 1028305 | mpp[tiflash] | | tiflash_task:{proc max:2.56s, min:898.8ms, avg: 1.82s, p80:2.49s, p95:2.56s, iters:1652, tasks:6, threads:240} | not(isnull(if(or(isnull(cast(Column#63, var_string(16777216))), eq(Column#63, cast(“”, json BINARY))), “0”, cast(Column#63, var_string(16777216))))) | N/A | N/A || └─Projection_208 | 2302579.25 | 1028305 | mpp[tiflash] | | tiflash_task:{proc max:2.55s, min:898.8ms, avg: 1.82s, p80:2.48s, p95:2.55s, iters:1652, tasks:6, threads:240} | Column#37, Column#38, Column#62, ifnull(json_extract(cast(Column#61, json BINARY), concat(.C, cast(Column#62, var_string(16777216)))), json_extract(cast(Column#61, json BINARY), concat(.C, substring(cast(Column#62, var_string(16777216)), 1, 2), 0000)))->Column#63 | N/A | N/A || └─Projection_205 | 2302579.25 | 1028305 | mpp[tiflash] | | tiflash_task:{proc max:2.5s, min:898.8ms, avg: 1.79s, p80:2.42s, p95:2.5s, iters:1652, tasks:6, threads:240} | Column#37, Column#38, ifnull(json_extract(cast(Column#50, json BINARY), concat(., substring(Column#37, 1, 2))), json_extract(cast(Column#50, json BINARY), concat(., substring(Column#37, 1, 1), )))->Column#62, Column#61 | N/A | N/A || └─Projection_202 | 2302579.25 | 1028305 | mpp[tiflash] | | tiflash_task:{proc max:2.44s, min:898.8ms, avg: 1.75s, p80:2.34s, p95:2.44s, iters:1652, tasks:6, threads:240} | Column#37, Column#38, Column#50, Column#61 | N/A | N/A || └─HashJoin_199 | 2302579.25 | 1028305 | mpp[tiflash] | | tiflash_task:{proc max:2.44s, min:898.8ms, avg: 1.75s, p80:2.34s, p95:2.44s, iters:1652, tasks:6, threads:240} | CARTESIAN inner join, other cond:ne(ifnull(cast(ifnull(json_extract(cast(Column#50, json BINARY), concat(“.”, substring(Column#37, 1, 2))), json_extract(cast(Column#50, json BINARY), concat(“.”, substring(Column#37, 1, 1), “”))), var_string(16777216)), “”), “”) | N/A | N/A || ├─ExchangeReceiver_68(Build) | 1.00 | 6 | mpp[tiflash] | | tiflash_task:{proc max:53.8ms, min:36.3ms, avg: 45.5ms, p80:51.8ms, p95:53.8ms, iters:6, tasks:6, threads:240} | | N/A | N/A || │ └─ExchangeSender_67 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:27.4ms, min:0s, avg: 4.56ms, p80:0s, p95:27.4ms, iters:1, tasks:6, threads:1} | ExchangeType: Broadcast, Compression: FAST | N/A | N/A || │ └─HashJoin_49 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:26.4ms, min:0s, avg: 4.4ms, p80:0s, p95:26.4ms, iters:1, tasks:6, threads:1} | CARTESIAN inner join | N/A | N/A || │ ├─ExchangeReceiver_59(Build) | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:24.4ms, min:0s, avg: 4.06ms, p80:0s, p95:24.4ms, iters:1, tasks:6, threads:40} | | N/A | N/A || │ │ └─ExchangeSender_58 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:19.7ms, min:0s, avg: 3.29ms, p80:0s, p95:19.7ms, iters:1, tasks:6, threads:1} | ExchangeType: Broadcast, Compression: FAST | N/A | N/A || │ │ └─Projection_51 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:19.7ms, min:0s, avg: 3.29ms, p80:0s, p95:19.7ms, iters:1, tasks:6, threads:1} | cast(concat({, Column#49, }), json BINARY)->Column#50 | N/A | N/A || │ │ └─Projection_54 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:19.7ms, min:0s, avg: 3.29ms, p80:0s, p95:19.7ms, iters:1, tasks:6, threads:1} | Column#49 | N/A | N/A || │ │ └─HashAgg_55 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:19.7ms, min:0s, avg: 3.29ms, p80:0s, p95:19.7ms, iters:1, tasks:6, threads:1} | funcs:group_concat(distinct Column#92, Column#93, Column#94, Column#95, Column#96 separator “,”)->Column#49 | N/A | N/A || │ │ └─Projection_215 | 1.00 | 1000 | mpp[tiflash] | | tiflash_task:{proc max:17.7ms, min:0s, avg: 2.96ms, p80:0s, p95:17.7ms, iters:1, tasks:6, threads:40} | Column#74->Column#92, Column#75->Column#93, Column#76->Column#94, cast(za.c_t2.c1, var_string(20))->Column#95, Column#77->Column#96 | N/A | N/A || │ │ └─ExchangeReceiver_57 | 1.00 | 1000 | mpp[tiflash] | | tiflash_task:{proc max:17.7ms, min:0s, avg: 2.96ms, p80:0s, p95:17.7ms, iters:1, tasks:6, threads:40} | | N/A | N/A || │ │ └─ExchangeSender_56 | 1.00 | 1000 | mpp[tiflash] | | tiflash_task:{proc max:12.4ms, min:0s, avg: 2.06ms, p80:0s, p95:12.4ms, iters:1, tasks:6, threads:1} | ExchangeType: PassThrough, Compression: FAST | N/A | N/A || │ │ └─HashAgg_52 | 1.00 | 1000 | mpp[tiflash] | | tiflash_task:{proc max:12.4ms, min:0s, avg: 2.06ms, p80:0s, p95:12.4ms, iters:1, tasks:6, threads:1} | group by:“”, “”“, “”:“, Column#90, Column#91, | N/A | N/A || │ │ └─Projection_214 | 1000.00 | 1000 | mpp[tiflash] | | tiflash_task:{proc max:10.4ms, min:0s, avg: 1.73ms, p80:0s, p95:10.4ms, iters:1, tasks:6, threads:40} | replace(za.c_t2.c9, *, _)->Column#90, za.c_t2.c1->Column#91 | N/A | N/A || │ │ └─TableFullScan_53 | 1000.00 | 1000 | mpp[tiflash] | table:c_t2 | tiflash_task:{proc max:10.4ms, min:0s, avg: 1.73ms, p80:0s, p95:10.4ms, iters:1, tasks:6, threads:40}, tiflash_scan:{dtfile:{total_scanned_packs:1, total_skipped_packs:0, total_scanned_rows:1000, total_skipped_rows:0, total_rs_index_check_time: 0ms, total_read_time: 0ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 1, total_remote_region_num: 0, total_learner_read_time: 1ms} | keep order:false | N/A | N/A || │ └─Projection_60(Probe) | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:25.4ms, min:0s, avg: 4.23ms, p80:0s, p95:25.4ms, iters:1, tasks:6, threads:1} | cast(concat({, Column#60, }), json BINARY)->Column#61 | N/A | N/A || │ └─Projection_63 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:25.4ms, min:0s, avg: 4.23ms, p80:0s, p95:25.4ms, iters:1, tasks:6, threads:1} | Column#60 | N/A | N/A || │ └─HashAgg_64 | 1.00 | 1 | mpp[tiflash] | | tiflash_task:{proc max:25.4ms, min:0s, avg: 4.23ms, p80:0s, p95:25.4ms, iters:1, tasks:6, threads:1} | funcs:group_concat(distinct Column#99, Column#100, Column#101, Column#102, Column#103 separator “,”)->Column#60 | N/A | N/A || │ └─Projection_217 | 1.00 | 1001 | mpp[tiflash] | | tiflash_task:{proc max:19.4ms, min:0s, avg: 3.23ms, p80:0s, p95:19.4ms, iters:1, tasks:6, threads:40} | Column#78->Column#99, Column#79->Column#100, Column#80->Column#101, cast(za.c_t3.c2, var_string(20))->Column#102, Column#81->Column#103 | N/A | N/A || │ └─ExchangeReceiver_66 | 1.00 | 1001 | mpp[tiflash] | | tiflash_task:{proc max:19.4ms, min:0s, avg: 3.23ms, p80:0s, p95:19.4ms, iters:1, tasks:6, threads:40} | | N/A | N/A || │ └─ExchangeSender_65 | 1.00 | 1001 | mpp[tiflash] | | tiflash_task:{proc max:13.9ms, min:0s, avg: 2.32ms, p80:0s, p95:13.9ms, iters:1, tasks:6, threads:1} | ExchangeType: PassThrough, Compression: FAST | N/A | N/A || │ └─HashAgg_61 | 1.00 | 1001 | mpp[tiflash] | | tiflash_task:{proc max:13.9ms, min:0s, avg: 2.32ms, p80:0s, p95:13.9ms, iters:1, tasks:6, threads:1} | group by:” “”, “”, “”: “, Column#97, Column#98, | N/A | N/A || │ └─Projection_216 | 1001.00 | 1001 | mpp[tiflash] | | tiflash_task:{proc max:11.9ms, min:0s, avg: 1.99ms, p80:0s, p95:11.9ms, iters:1, tasks:6, threads:40} | concat(C, za.c_t3.c1)->Column#97, za.c_t3.c2->Column#98 | N/A | N/A || │ └─TableFullScan_62 | 1001.00 | 1001 | mpp[tiflash] | table:c_t3 | tiflash_task:{proc max:11.9ms, min:0s, avg: 1.99ms, p80:0s, p95:11.9ms, iters:1, tasks:6, threads:40}, tiflash_scan:{dtfile:{total_scanned_packs:1, total_skipped_packs:0, total_scanned_rows:1001, total_skipped_rows:0, total_rs_index_check_time: 0ms, total_read_time: 0ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 1, total_remote_region_num: 0, total_learner_read_time: 1ms} | keep order:false | N/A | N/A || └─Union_69(Probe) | 2302579.25 | 0 | mpp[tiflash] | | tiflash_task:{proc max:0s, min:0s, avg: 0s, p80:0s, p95:0s, iters:0, tasks:6, threads:0} | | N/A | N/A || ├─Projection_70 | 1284216.97 | 1500895 | mpp[tiflash] | | tiflash_task:{proc max:1.7s, min:0s, avg: 742.1ms, p80:1.41s, p95:1.7s, iters:1652, tasks:6, threads:120} | cast(za.c_t4_dt.c1, varchar(32) CHARACTER SET utf8 COLLATE utf8_bin)->Column#37, za.c_t4_dt.c2->Column#38 | N/A | N/A || │ └─Projection_71 | 1284216.97 | 1500895 | mpp[tiflash] | | tiflash_task:{proc max:1.7s, min:0s, avg: 741.1ms, p80:1.41s, p95:1.7s, iters:1652, tasks:6, threads:120} | za.c_t4_dt.c1, za.c_t4_dt.c2 | N/A | N/A || │ └─HashJoin_72 | 1284216.97 | 1500895 | mpp[tiflash] | | tiflash_task:{proc max:1.7s, min:0s, avg: 741.1ms, p80:1.41s, p95:1.7s, iters:1652, tasks:6, threads:120} | inner join, equal:[eq(za.c_t4_dt.c1, za.c_t1.c1)] | N/A | N/A || │ ├─ExchangeReceiver_76(Build) | 1003426.73 | 3002814 | mpp[tiflash] | | tiflash_task:{proc max:200.6ms, min:0s, avg: 97.7ms, p80:199ms, p95:200.6ms, iters:315, tasks:6, threads:120} | | N/A | N/A || │ │ └─ExchangeSender_75 | 1003426.73 | 1000938 | mpp[tiflash] | | tiflash_task:{proc max:198.7ms, min:0s, avg: 33.1ms, p80:0s, p95:198.7ms, iters:1604, tasks:6, threads:120} | ExchangeType: Broadcast, Compression: FAST | N/A | N/A || │ │ └─TableFullScan_73 | 1003426.73 | 1000938 | mpp[tiflash] | table:c_t4_dt | tiflash_task:{proc max:195.7ms, min:0s, avg: 32.6ms, p80:0s, p95:195.7ms, iters:1604, tasks:6, threads:120}, tiflash_scan:{dtfile:{total_scanned_packs:24768, total_skipped_packs:1747, total_scanned_rows:200037563, total_skipped_rows:14029436, total_rs_index_check_time: 41ms, total_read_time: 6185ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 201, total_remote_region_num: 0, total_learner_read_time: 32ms} | pushed down filter:ge(za.c_t4_dt.c2, 2024-02-03 11:28:25), le(za.c_t4_dt.c2, 2024-04-03 11:28:25), keep order:false | N/A | N/A || │ └─Selection_78(Probe) | 100000000.00 | 100000000 | mpp[tiflash] | | tiflash_task:{proc max:1.37s, min:0s, avg: 490ms, p80:800.3ms, p95:1.37s, iters:1652, tasks:6, threads:120} | not(isnull(za.c_t1.c1)) | N/A | N/A || │ └─TableFullScan_77 | 100000000.00 | 100000000 | mpp[tiflash] | table:c_t1 | tiflash_task:{proc max:1.37s, min:0s, avg: 487.5ms, p80:786.3ms, p95:1.37s, iters:1652, tasks:6, threads:120}, tiflash_scan:{dtfile:{total_scanned_packs:12261, total_skipped_packs:47, total_scanned_rows:100000000, total_skipped_rows:375339, total_rs_index_check_time: 2ms, total_read_time: 5001ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 98, total_remote_region_num: 0, total_learner_read_time: 38ms} | pushed down filter:empty, keep order:false | N/A | N/A || └─Projection_80 | 1018362.28 | 0 | mpp[tiflash] | | tiflash_task:{proc max:1.69s, min:0s, avg: 583.8ms, p80:913.8ms, p95:1.69s, iters:1651, tasks:6, threads:120} | za.c_t1.c1->Column#37, za.c_t4_dt.c2->Column#38 | N/A | N/A || └─HashJoin_81 | 1018362.28 | 0 | mpp[tiflash] | | tiflash_task:{proc max:1.69s, min:0s, avg: 583.8ms, p80:913.8ms, p95:1.69s, iters:1651, tasks:6, threads:120} | inner join, equal:[eq(za.c_t4_dt.c1, za.c_t1.c2)] | N/A | N/A || ├─ExchangeReceiver_85(Build) | 1003426.73 | 3002814 | mpp[tiflash] | | tiflash_task:{proc max:194.8ms, min:0s, avg: 96.7ms, p80:192.8ms, p95:194.8ms, iters:294, tasks:6, threads:120} |
版权声明: 本文为 InfoQ 作者【TiDB 社区干货传送门】的原创文章。
原文链接:【http://xie.infoq.cn/article/854a011c38d6d41e3a8ff78b4】。文章转载请联系作者。
评论