Trying Out TiDB 4.0: TiFlash

July 11, 2022

Author: benben. Original source: https://tidb.net/blog/e97d3ea0

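Overview

TiFlash is a major improvement in 4.0 that fills in TiDB's support for OLAP. Columnar storage is the best fit for OLAP-style analysis. For example, suppose a User table has 50 columns and we want the distribution of users along a few dimensions such as region or age. Because all the region values are stored together in the same blocks, the storage density for that column is, as a rough estimate, 50 times that of row storage (assuming the columns are of similar width and the query reads only one of them). In other words, disk I/O drops by about a factor of 50, and I/O is generally the slowest part of a database system. The time and resource cost of the whole request therefore falls to roughly 1/50. On top of that, a columnar layout compresses better than a row layout, which reduces I/O even further (think about why). Let's get hands-on and try it out.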
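The rough 50x estimate above can be sketched as a small calculation. This is only a back-of-the-envelope model: the row count and the uniform 8-byte value width are assumptions made for illustration, not TiFlash measurements, and compression is ignored.

```python
# Back-of-the-envelope I/O estimate for row vs. column storage.
# All sizes are illustrative assumptions, not TiFlash measurements.

def bytes_scanned_row_store(num_rows, num_columns, bytes_per_value):
    """A row store reads whole rows, so every column is pulled from disk."""
    return num_rows * num_columns * bytes_per_value

def bytes_scanned_column_store(num_rows, columns_needed, bytes_per_value):
    """A column store reads only the columns the query references."""
    return num_rows * columns_needed * bytes_per_value

num_rows = 6_000_000   # hypothetical table size
num_columns = 50       # the User table from the example above
bytes_per_value = 8    # assumed uniform value width

row_bytes = bytes_scanned_row_store(num_rows, num_columns, bytes_per_value)
col_bytes = bytes_scanned_column_store(num_rows, 1, bytes_per_value)
io_ratio = row_bytes / col_bytes

print(f"row store: {row_bytes} bytes, column store: {col_bytes} bytes, ratio: {io_ratio:.0f}x")
```

If the query touched 5 of the 50 columns instead of 1, the same model would give a 10x reduction, so the benefit depends on how narrow the query is.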

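Installing TiFlash

My existing development environment was installed with ansible. The official word is that tiup will gradually replace ansible, so I went straight to tiup: import the cluster, upgrade it, then install TiFlash. The process went very smoothly. If anything is unclear, run --help to see how a given tiup command is used.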

tiup cluster import -d /home/tidb/tidb-ansible
tiup cluster upgrade develop v4.0.0-rc
tiup cluster scale-out develop scale-out.yaml


After installation, run tiup cluster display develop to check that each component was installed and updated correctly.

TiFlash replication

Setting up a replica

alter table dim.dim_worker_info_df set tiflash replica 1;

One open question here: what is a replica count greater than 1 used for?

Removing the replica

alter table dim.dim_worker_info_df set tiflash replica 0;

Checking replication progress

select * from information_schema.tiflash_replica where table_schema = 'dim' and table_name = 'dim_worker_info_df'


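Here progress is the synchronization progress, ranging from 0 to 1, where 1 means 100%, that is, synchronization is complete. available=1 means the TiFlash replica of this table is ready to serve queries. In my test, a table with 6 million rows finished synchronizing in 3 seconds, which is quite fast.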
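When scripting this check, the progress query can be polled until the replica is ready. The helper below is a minimal sketch, not part of TiDB or any client library: wait_for_tiflash_replica and its fetch_status argument are hypothetical names, and fetch_status is assumed to run the information_schema.tiflash_replica query above and return the (available, progress) pair, or None when no replica row exists yet.

```python
import time

def wait_for_tiflash_replica(fetch_status, timeout_s=60.0, poll_interval_s=1.0,
                             sleep=time.sleep):
    """Poll until the replica is available or the timeout elapses.

    fetch_status is a caller-supplied (hypothetical) callable that returns
    (available, progress) as read from information_schema.tiflash_replica,
    or None if the table has no TiFlash replica row yet.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status is not None:
            available, progress = status
            # available=1 and progress=1 together mean the replica is usable.
            if available == 1 and progress >= 1.0:
                return True
        if waited >= timeout_s:
            return False
        sleep(poll_interval_s)
        waited += poll_interval_s

# Usage with canned results standing in for real query output.
fake_results = iter([(0, 0.0), (0, 0.4), (1, 1.0)])
done = wait_for_tiflash_replica(lambda: next(fake_results), sleep=lambda s: None)
print(done)
```

Passing the query in as a callable keeps the sketch independent of any particular MySQL driver; in practice fetch_status would wrap whichever client is used to connect to TiDB.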

SQL test

TiFlash: 73 ms

set @@session.tidb_isolation_read_engines='tikv,tiflash';

explain analyze select a.nation ,count(1) from dim.dim_worker_info_df a group by a.nation

TiKV: 429 ms

set @@session.tidb_isolation_read_engines='tikv';

Setting the read engine to tikv forces the query to run on TiKV.

explain analyze select a.nation,count(1) from dim.dim_worker_info_df a group by a.nation

Use in joins

explain select a.*,b.birth_place from dwd.dwd_worker_attendance_day_df a inner join dim.dim_worker_info_df b on a.worker_id = b.worker_id

TiDB's execution plan chooses the appropriate engine based on the specific columns referenced in the SQL.

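Summary

Because the tables I tested were not very large, they fit entirely in the cache of the TiKV/TiFlash nodes. The two engines differed by about 6x in performance (429 ms vs. 73 ms), which is not a large gap. For a clearer difference, I suggest testing with a table of more than 10 million rows and around 50 columns.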
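For reference, the "about 6x" figure follows directly from the two timings reported in the SQL test. The snippet below simply reproduces that arithmetic; it is a single-run comparison, not a benchmark.

```python
# Timings reported in the SQL test section (single runs, warm cache).
tikv_ms = 429
tiflash_ms = 73

speedup = tikv_ms / tiflash_ms
print(f"TiFlash was about {speedup:.1f}x faster in this run")
```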


Original link: http://xie.infoq.cn/article/dcfaf42d93e3e1dba64a716b4. Please contact the author before reposting.
