使用 TPC-H 进行 GreatSQL 并行查询测试
作者:GreatSQL
- 2023-05-04 福建
本文字数:7928 字
阅读完需:约 26 分钟
准备工作
数据库版本
生成数据
使用 TPC-H 生成数据
#TPC-H Population Generator (Version 3.0.0)
#生成10G的数据$ ./dbgen -vf -s 10
复制代码
修改 my.cnf
vim /etc/my.cnf
#设置IPB为8Ginnodb_buffer_pool_size = 8G
#设置并行查询的使用最大内存(此处为8G,根据具体配置设置)parallel_memory_limit= 8G
#打开并行查询force_parallel_execute=1
#设置双1(方便导入数据)innodb_flush_log_at_trx_commit = 1 sync_binlog = 1
#关闭binlogskip-log_bin
datadir = /data/GreatSQLsocket = mysql.sock
复制代码
启动数据库后,可以检查配置是否生效
mysql> show variables like '%double%';mysql> show variables like 'log_bin';mysql> show variables like 'sync_binlog';mysql> show variables like 'innodb_flush_log_at_trx_commit';mysql> show variables like 'innodb_buffer_pool_size';
复制代码
并行查询相关参数
mysql> show global variables like '%parall%';+----------------------------------+----------------+| force_parallel_execute | ON || innodb_parallel_dblwr_encrypt | OFF || innodb_parallel_doublewrite_path | xb_doublewrite || innodb_parallel_read_threads | 4 || parallel_cost_threshold | 1000 || parallel_default_dop | 4 || parallel_max_threads | 64 || parallel_memory_limit | 8589934592 || parallel_queue_timeout | 0 || slave_parallel_type | LOGICAL_CLOCK || slave_parallel_workers | 2 |+----------------------------------+----------------+11 rows in set (0.01 sec)
复制代码
启动数据库
启动数据库:
$ systemctl start greatsql.service
复制代码
文件准备
本次的工作在/data/tpch
可执行程序为 dbgen,依赖一个数据分布文件 dists.dss。可以将 dbgen 和 dists.dss 拷贝到同一目录使用
dss.ddl 和 dss.ri 文件
准备表结构和索引文件 dss.ddl 和 dss.ri 到工作目录
$ cd /data/tpch/tpch_2.18.0/dbgen$ cp dss.ri /data/tpch/$ cp dss.ddl /data/tpch/
复制代码
load.sql
修改 load.sql 文件 修改文件的路径
$ cd /data/tpch/$ cp load.sql loadfix.sql $ vim loadfix.sql
复制代码
导入数据
$ /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock
#创建数据库mysql> create database tpch;mysql> use tpch;
#导入表结构mysql> source /data/tpch/dss.ddl;
#导入数据(文件见附录)mysql> sh loadfile
#导入索引、外键等mysql> source /data/tpch/dssfix.ri
复制代码
注:binlog 要关再导入,否则 binlog 会爆
Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mysqld variable and try again
复制代码
查看导入的表(20G 数据)
mysql> select table_name,table_rows from information_schema.tables where table_name in ('customer','lineitem','nation','orders','part','partsupp','region','supplier');
+------------+------------+| TABLE_NAME | TABLE_ROWS |+------------+------------+| region | 5 || nation | 25 || part | 3860136 || supplier | 197853 || customer | 2884322 || partsupp | 17084176 || orders | 29678499 || lineitem | 87786966 |+------------+------------+8 rows in set (0.00 sec)
复制代码
编写并运行测试脚本
#测试脚本见附录#这里是在tmux中运行,避免因为终端关闭导致测试终止$ sh auto.sh&
#测试结束后在当前脚本的目录查看生成的日志$ cat tpch-PQ-******.log
复制代码
测试运行时,观察相关指标。
mysql> show global status like '%PQ%';+--------------------+-------+| Variable_name | Value |+--------------------+-------+| PQ_memory_refused | 0 || PQ_memory_used | 0 || PQ_threads_refused | 0 || PQ_threads_running | 0 |+--------------------+-------+4 rows in set (0.00 sec)
mysql> show processlist;mysql> explain for connection **;
复制代码
测试结果
开启并行查询(16 线程)的执行时间,与不开启并行查询的执行时间如下:
注:本文章重点讲测试过程,具体的测试结果就不展开了。
****
附录-相关文件
导入脚本
$ cat loadfile/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/region.tbl' into table region FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/nation.tbl' into table nation FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/supplier.tbl' into table supplier FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/part.tbl' into table part FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/customer.tbl' into table customer FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/partsupp.tbl' into table partsupp FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/orders.tbl' into table orders FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/lineitem.tbl' into table lineitem FIELDS TERMINATED BY '|';" tpch &
复制代码
测试脚本:
脚本是东拼西凑的,写的不好,希望有大佬能指导一下
$ cat auto.sh #include <iostream>TH=$PATH:/usr/local/binexport PATH#set -u#set -x#set -e. ~/.bash_profile > /dev/null 2>&1exec 3>&1 4>&2 1>> tpch-PQ-`date +'%Y%m%d%H%M%S'`.log 2>&1
# 定义要执行的SQL文件存放的目录SQL_DIR="/data/tpch/SQLs"
# 判断目录是否存在if [ ! -d "$SQL_DIR" ]; then echo "SQL文件目录不存在!" exit 1fi
# 进入SQL文件目录cd $SQL_DIR
I=1II=3while [ $I -le $II ]do # 执行SQL文件 for file in `ls *.sql` do echo "正在执行:$file" time /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock -Dtpch < $file echo "SQL:$file,执行完成" echo -e echo "休息100s" sleep 100 echo -e done echo "第$I次循环执行完成!"I=`expr $I + 1`doneecho "脚本结束"
复制代码
dss.ddl
-- Sccsid: @(#)dss.ddl 2.1.8.1drop database tpch;create database tpch;use tpch;CREATE TABLE NATION ( N_NATIONKEY INTEGER NOT NULL, N_NAME CHAR(25) NOT NULL, N_REGIONKEY INTEGER NOT NULL, N_COMMENT VARCHAR(152));
CREATE TABLE REGION ( R_REGIONKEY INTEGER NOT NULL, R_NAME CHAR(25) NOT NULL, R_COMMENT VARCHAR(152));
CREATE TABLE PART ( P_PARTKEY INTEGER NOT NULL, P_NAME VARCHAR(55) NOT NULL, P_MFGR CHAR(25) NOT NULL, P_BRAND CHAR(10) NOT NULL, P_TYPE VARCHAR(25) NOT NULL, P_SIZE INTEGER NOT NULL, P_CONTAINER CHAR(10) NOT NULL, P_RETAILPRICE DECIMAL(15,2) NOT NULL, P_COMMENT VARCHAR(23) NOT NULL );
CREATE TABLE SUPPLIER ( S_SUPPKEY INTEGER NOT NULL, S_NAME CHAR(25) NOT NULL, S_ADDRESS VARCHAR(40) NOT NULL, S_NATIONKEY INTEGER NOT NULL, S_PHONE CHAR(15) NOT NULL, S_ACCTBAL DECIMAL(15,2) NOT NULL, S_COMMENT VARCHAR(101) NOT NULL);
CREATE TABLE PARTSUPP ( PS_PARTKEY INTEGER NOT NULL, PS_SUPPKEY INTEGER NOT NULL, PS_AVAILQTY INTEGER NOT NULL, PS_SUPPLYCOST DECIMAL(15,2) NOT NULL, PS_COMMENT VARCHAR(199) NOT NULL );
CREATE TABLE CUSTOMER ( C_CUSTKEY INTEGER NOT NULL, C_NAME VARCHAR(25) NOT NULL, C_ADDRESS VARCHAR(40) NOT NULL, C_NATIONKEY INTEGER NOT NULL, C_PHONE CHAR(15) NOT NULL, C_ACCTBAL DECIMAL(15,2) NOT NULL, C_MKTSEGMENT CHAR(10) NOT NULL, C_COMMENT VARCHAR(117) NOT NULL);
CREATE TABLE ORDERS ( O_ORDERKEY INTEGER NOT NULL, O_CUSTKEY INTEGER NOT NULL, O_ORDERSTATUS CHAR(1) NOT NULL, O_TOTALPRICE DECIMAL(15,2) NOT NULL, O_ORDERDATE DATE NOT NULL, O_ORDERPRIORITY CHAR(15) NOT NULL, O_CLERK CHAR(15) NOT NULL, O_SHIPPRIORITY INTEGER NOT NULL, O_COMMENT VARCHAR(79) NOT NULL);
CREATE TABLE LINEITEM ( L_ORDERKEY INTEGER NOT NULL, L_PARTKEY INTEGER NOT NULL, L_SUPPKEY INTEGER NOT NULL, L_LINENUMBER INTEGER NOT NULL, L_QUANTITY DECIMAL(15,2) NOT NULL, L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL, L_DISCOUNT DECIMAL(15,2) NOT NULL, L_TAX DECIMAL(15,2) NOT NULL, L_RETURNFLAG CHAR(1) NOT NULL, L_LINESTATUS CHAR(1) NOT NULL, L_SHIPDATE DATE NOT NULL, L_COMMITDATE DATE NOT NULL, L_RECEIPTDATE DATE NOT NULL, L_SHIPINSTRUCT CHAR(25) NOT NULL, L_SHIPMODE CHAR(10) NOT NULL, L_COMMENT VARCHAR(44) NOT NULL);
复制代码
dss.ri
-- Sccsid: @(#)dss.ri 2.1.8.1-- tpch Benchmark Version 8.0
-- For table REGIONALTER TABLE tpch.REGIONADD PRIMARY KEY (R_REGIONKEY);
-- For table NATIONALTER TABLE tpch.NATIONADD PRIMARY KEY (N_NATIONKEY);
ALTER TABLE tpch.NATIONADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references tpch.REGION(R_REGIONKEY);
COMMIT WORK;
-- For table PARTALTER TABLE tpch.PARTADD PRIMARY KEY (P_PARTKEY);
COMMIT WORK;
-- For table SUPPLIERALTER TABLE tpch.SUPPLIERADD PRIMARY KEY (S_SUPPKEY);
ALTER TABLE tpch.SUPPLIERADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references tpch.NATION(N_NATIONKEY);
COMMIT WORK;
-- For table PARTSUPPALTER TABLE tpch.PARTSUPPADD PRIMARY KEY (PS_PARTKEY,PS_SUPPKEY);
COMMIT WORK;
-- For table CUSTOMERALTER TABLE tpch.CUSTOMERADD PRIMARY KEY (C_CUSTKEY);
ALTER TABLE tpch.CUSTOMERADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references tpch.NATION(N_NATIONKEY);
COMMIT WORK;
-- For table LINEITEMALTER TABLE tpch.LINEITEMADD PRIMARY KEY (L_ORDERKEY,L_LINENUMBER);
COMMIT WORK;
-- For table ORDERSALTER TABLE tpch.ORDERSADD PRIMARY KEY (O_ORDERKEY);
COMMIT WORK;
-- For table PARTSUPPALTER TABLE tpch.PARTSUPPADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references tpch.SUPPLIER(S_SUPPKEY);
COMMIT WORK;
ALTER TABLE tpch.PARTSUPPADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references tpch.PART(P_PARTKEY);
COMMIT WORK;
-- For table ORDERSALTER TABLE tpch.ORDERSADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references tpch.CUSTOMER(C_CUSTKEY);
COMMIT WORK;
-- For table LINEITEMALTER TABLE tpch.LINEITEMADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY) references tpch.ORDERS(O_ORDERKEY);
COMMIT WORK;
ALTER TABLE tpch.LINEITEMADD FOREIGN KEY LINEITEM_FK2 (L_PARTKEY,L_SUPPKEY) references tpch.PARTSUPP(PS_PARTKEY,PS_SUPPKEY);
COMMIT WORK;
复制代码
SQL 语句
--SQL1 select /*+ PQ(16) */ l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_orderfrom lineitemwhere l_shipdate <= date '1998-12-01' - interval '88' daygroup by l_returnflag, l_linestatusorder by l_returnflag, l_linestatus limit 1;
--SQL3select /*+ PQ(16) */ l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriorityfrom customer, orders, lineitemwhere c_mktsegment = 'MACHINERY' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date '1995-03-01' and l_shipdate > date '1995-03-01'group by l_orderkey, o_orderdate, o_shippriorityorder by revenue desc, o_orderdate limit 10;
--SQL6select /*+ PQ(16) */ sum(l_extendedprice * l_discount) as revenuefrom lineitemwhere l_shipdate >= date '1993-01-01' and l_shipdate < date '1993-01-01' + interval '1' year and l_discount between 0.02 - 0.01 and 0.02 + 0.01 and l_quantity < 24 limit 1;
--SQL10select /*+ PQ(16) */ c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_commentfrom customer, orders, lineitem, nationwhere c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= date '1994-05-01' and o_orderdate < date '1994-05-01' + interval '3' month and l_returnflag = 'R' and c_nationkey = n_nationkeygroup by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_commentorder by revenue desc limit 20;
--SQL12select /*+ PQ(16) */ l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_countfrom orders, lineitemwhere o_orderkey = l_orderkey and l_shipmode in ('TRUCK', 'FOB') and l_commitdate < l_receiptdate and l_shipdate < l_commitdate and l_receiptdate >= date '1996-01-01' and l_receiptdate < date '1996-01-01' + interval '1' yeargroup by l_shipmodeorder by l_shipmode limit 1 ;
--SQL19select /*+ PQ(16) */ sum(l_extendedprice* (1 - l_discount)) as revenuefrom lineitem, partwhere ( p_partkey = l_partkey and p_brand = 'Brand#12' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#22' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 15 and l_quantity <= 15 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#43' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 22 and l_quantity <= 22 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) limit 1 ;
复制代码
相关问题
参考资料
https://imysql.com/2012/12/21/tpch-for-mysql-manual.html
划线
评论
复制
发布于: 刚刚阅读数: 2
版权声明: 本文为 InfoQ 作者【GreatSQL】的原创文章。
原文链接:【http://xie.infoq.cn/article/02e5be27747dc605ae3733a83】。未经作者许可,禁止转载。
GreatSQL
关注
GreatSQL社区 2023-01-31 加入
GreatSQL是由万里数据库维护的MySQL分支,专注于提升MGR可靠性及性能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版本。 社区:https://greatsql.cn/ Gitee: https://gitee.com/GreatSQL/GreatSQL










评论