写点什么

使用 TPC-H 进行 GreatSQL 并行查询测试

作者:GreatSQL
  • 2023-05-04
    福建
  • 本文字数:7928 字

    阅读完需:约 26 分钟

准备工作

数据库版本

GreatSQL-8.0.25-17

生成数据

使用 TPC-H 生成数据


#TPC-H Population Generator (Version 3.0.0)
#生成10G的数据$ ./dbgen -vf -s 10
复制代码

修改 my.cnf

vim /etc/my.cnf 
#设置IPB为8Ginnodb_buffer_pool_size = 8G
#设置并行查询的使用最大内存(此处为8G,根据具体配置设置)parallel_memory_limit= 8G
#打开并行查询force_parallel_execute=1
#设置双1(方便导入数据)innodb_flush_log_at_trx_commit = 1 sync_binlog = 1
#关闭binlogskip-log_bin
datadir = /data/GreatSQLsocket = mysql.sock
复制代码


启动数据库后,可以检查配置是否生效


mysql> show variables like '%double%';mysql> show variables like 'log_bin';mysql> show variables like 'sync_binlog';mysql> show variables like 'innodb_flush_log_at_trx_commit';mysql> show variables like 'innodb_buffer_pool_size';
复制代码


并行查询相关参数


mysql> show global variables like '%parall%';+----------------------------------+----------------+| force_parallel_execute           | ON             || innodb_parallel_dblwr_encrypt    | OFF            || innodb_parallel_doublewrite_path | xb_doublewrite || innodb_parallel_read_threads     | 4              || parallel_cost_threshold          | 1000           || parallel_default_dop             | 4              || parallel_max_threads             | 64             || parallel_memory_limit            | 8589934592     || parallel_queue_timeout           | 0              || slave_parallel_type              | LOGICAL_CLOCK  || slave_parallel_workers           | 2              |+----------------------------------+----------------+11 rows in set (0.01 sec)
复制代码

启动数据库

启动数据库:


$ systemctl start greatsql.service 
复制代码

文件准备

本次的工作在/data/tpch


可执行程序为 dbgen,依赖一个数据分布文件 dists.dss。可以将 dbgen 和 dists.dss 拷贝到同一目录使用


dss.ddl dss.ri 文件


准备表结构和索引文件 dss.ddldss.ri 到工作目录


$ cd /data/tpch/tpch_2.18.0/dbgen$ cp dss.ri /data/tpch/$ cp dss.ddl /data/tpch/
复制代码


load.sql


修改 load.sql 文件 修改文件的路径


$ cd /data/tpch/$ cp load.sql loadfix.sql $ vim loadfix.sql
复制代码

导入数据

$ /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock
#创建数据库mysql> create database tpch;mysql> use tpch;
#导入表结构mysql> source /data/tpch/dss.ddl;
#导入数据(文件见附录)mysql> sh loadfile
#导入索引、外键等mysql> source /data/tpch/dssfix.ri
复制代码


注:binlog 要关再导入,否则 binlog 会爆


Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mysqld variable and try again
复制代码


查看导入的表(20G 数据)


mysql> select table_name,table_rows from information_schema.tables where table_name in ('customer','lineitem','nation','orders','part','partsupp','region','supplier');
+------------+------------+| TABLE_NAME | TABLE_ROWS |+------------+------------+| region | 5 || nation | 25 || part | 3860136 || supplier | 197853 || customer | 2884322 || partsupp | 17084176 || orders | 29678499 || lineitem | 87786966 |+------------+------------+8 rows in set (0.00 sec)
复制代码

编写并运行测试脚本

#测试脚本见附录#这里是在tmux中运行,避免因为终端关闭导致测试终止$ sh auto.sh&
#测试结束后在当前脚本的目录查看生成的日志$ cat tpch-PQ-******.log
复制代码


测试运行时,观察相关指标。


mysql> show global status like '%PQ%';+--------------------+-------+| Variable_name      | Value |+--------------------+-------+| PQ_memory_refused  | 0     || PQ_memory_used     | 0     || PQ_threads_refused | 0     || PQ_threads_running | 0     |+--------------------+-------+4 rows in set (0.00 sec)

mysql> show processlist;mysql> explain for connection **;
复制代码

测试结果

开启并行查询(16 线程)的执行时间,与不开启并行查询的执行时间如下:



注:本文章重点讲测试过程,具体的测试结果就不展开了。


****


附录-相关文件

导入脚本

$ cat loadfile/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/region.tbl' into table region FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/nation.tbl' into table nation FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/supplier.tbl' into table supplier FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/part.tbl' into table part FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/customer.tbl' into table customer FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/partsupp.tbl' into table partsupp FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/orders.tbl' into table orders FIELDS TERMINATED BY '|';" tpch &/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/lineitem.tbl' into table lineitem FIELDS TERMINATED BY '|';" tpch &
复制代码

测试脚本:

脚本是东拼西凑的,写的不好,希望有大佬能指导一下


$ cat auto.sh #include <iostream>TH=$PATH:/usr/local/binexport PATH#set -u#set -x#set -e. ~/.bash_profile > /dev/null 2>&1exec 3>&1 4>&2 1>> tpch-PQ-`date +'%Y%m%d%H%M%S'`.log 2>&1
# 定义要执行的SQL文件存放的目录SQL_DIR="/data/tpch/SQLs"
# 判断目录是否存在if [ ! -d "$SQL_DIR" ]; then echo "SQL文件目录不存在!" exit 1fi
# 进入SQL文件目录cd $SQL_DIR

I=1II=3while [ $I -le $II ]do # 执行SQL文件 for file in `ls *.sql` do echo "正在执行:$file" time /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock -Dtpch < $file echo "SQL:$file,执行完成" echo -e echo "休息100s" sleep 100 echo -e done echo "第$I次循环执行完成!"I=`expr $I + 1`doneecho "脚本结束"
复制代码

dss.ddl

-- Sccsid:     @(#)dss.ddl    2.1.8.1drop database tpch;create database tpch;use tpch;CREATE TABLE NATION  ( N_NATIONKEY  INTEGER NOT NULL,                            N_NAME       CHAR(25) NOT NULL,                            N_REGIONKEY  INTEGER NOT NULL,                            N_COMMENT    VARCHAR(152));
CREATE TABLE REGION ( R_REGIONKEY INTEGER NOT NULL, R_NAME CHAR(25) NOT NULL, R_COMMENT VARCHAR(152));
CREATE TABLE PART ( P_PARTKEY INTEGER NOT NULL, P_NAME VARCHAR(55) NOT NULL, P_MFGR CHAR(25) NOT NULL, P_BRAND CHAR(10) NOT NULL, P_TYPE VARCHAR(25) NOT NULL, P_SIZE INTEGER NOT NULL, P_CONTAINER CHAR(10) NOT NULL, P_RETAILPRICE DECIMAL(15,2) NOT NULL, P_COMMENT VARCHAR(23) NOT NULL );
CREATE TABLE SUPPLIER ( S_SUPPKEY INTEGER NOT NULL, S_NAME CHAR(25) NOT NULL, S_ADDRESS VARCHAR(40) NOT NULL, S_NATIONKEY INTEGER NOT NULL, S_PHONE CHAR(15) NOT NULL, S_ACCTBAL DECIMAL(15,2) NOT NULL, S_COMMENT VARCHAR(101) NOT NULL);
CREATE TABLE PARTSUPP ( PS_PARTKEY INTEGER NOT NULL, PS_SUPPKEY INTEGER NOT NULL, PS_AVAILQTY INTEGER NOT NULL, PS_SUPPLYCOST DECIMAL(15,2) NOT NULL, PS_COMMENT VARCHAR(199) NOT NULL );
CREATE TABLE CUSTOMER ( C_CUSTKEY INTEGER NOT NULL, C_NAME VARCHAR(25) NOT NULL, C_ADDRESS VARCHAR(40) NOT NULL, C_NATIONKEY INTEGER NOT NULL, C_PHONE CHAR(15) NOT NULL, C_ACCTBAL DECIMAL(15,2) NOT NULL, C_MKTSEGMENT CHAR(10) NOT NULL, C_COMMENT VARCHAR(117) NOT NULL);
CREATE TABLE ORDERS ( O_ORDERKEY INTEGER NOT NULL, O_CUSTKEY INTEGER NOT NULL, O_ORDERSTATUS CHAR(1) NOT NULL, O_TOTALPRICE DECIMAL(15,2) NOT NULL, O_ORDERDATE DATE NOT NULL, O_ORDERPRIORITY CHAR(15) NOT NULL, O_CLERK CHAR(15) NOT NULL, O_SHIPPRIORITY INTEGER NOT NULL, O_COMMENT VARCHAR(79) NOT NULL);
CREATE TABLE LINEITEM ( L_ORDERKEY INTEGER NOT NULL, L_PARTKEY INTEGER NOT NULL, L_SUPPKEY INTEGER NOT NULL, L_LINENUMBER INTEGER NOT NULL, L_QUANTITY DECIMAL(15,2) NOT NULL, L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL, L_DISCOUNT DECIMAL(15,2) NOT NULL, L_TAX DECIMAL(15,2) NOT NULL, L_RETURNFLAG CHAR(1) NOT NULL, L_LINESTATUS CHAR(1) NOT NULL, L_SHIPDATE DATE NOT NULL, L_COMMITDATE DATE NOT NULL, L_RECEIPTDATE DATE NOT NULL, L_SHIPINSTRUCT CHAR(25) NOT NULL, L_SHIPMODE CHAR(10) NOT NULL, L_COMMENT VARCHAR(44) NOT NULL);
复制代码

dss.ri

-- Sccsid:     @(#)dss.ri    2.1.8.1-- tpch Benchmark Version 8.0
-- For table REGIONALTER TABLE tpch.REGIONADD PRIMARY KEY (R_REGIONKEY);
-- For table NATIONALTER TABLE tpch.NATIONADD PRIMARY KEY (N_NATIONKEY);
ALTER TABLE tpch.NATIONADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references tpch.REGION(R_REGIONKEY);
COMMIT WORK;
-- For table PARTALTER TABLE tpch.PARTADD PRIMARY KEY (P_PARTKEY);
COMMIT WORK;
-- For table SUPPLIERALTER TABLE tpch.SUPPLIERADD PRIMARY KEY (S_SUPPKEY);
ALTER TABLE tpch.SUPPLIERADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references tpch.NATION(N_NATIONKEY);
COMMIT WORK;
-- For table PARTSUPPALTER TABLE tpch.PARTSUPPADD PRIMARY KEY (PS_PARTKEY,PS_SUPPKEY);
COMMIT WORK;
-- For table CUSTOMERALTER TABLE tpch.CUSTOMERADD PRIMARY KEY (C_CUSTKEY);
ALTER TABLE tpch.CUSTOMERADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references tpch.NATION(N_NATIONKEY);
COMMIT WORK;
-- For table LINEITEMALTER TABLE tpch.LINEITEMADD PRIMARY KEY (L_ORDERKEY,L_LINENUMBER);
COMMIT WORK;
-- For table ORDERSALTER TABLE tpch.ORDERSADD PRIMARY KEY (O_ORDERKEY);
COMMIT WORK;
-- For table PARTSUPPALTER TABLE tpch.PARTSUPPADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references tpch.SUPPLIER(S_SUPPKEY);
COMMIT WORK;
ALTER TABLE tpch.PARTSUPPADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references tpch.PART(P_PARTKEY);
COMMIT WORK;
-- For table ORDERSALTER TABLE tpch.ORDERSADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references tpch.CUSTOMER(C_CUSTKEY);
COMMIT WORK;
-- For table LINEITEMALTER TABLE tpch.LINEITEMADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY) references tpch.ORDERS(O_ORDERKEY);
COMMIT WORK;
ALTER TABLE tpch.LINEITEMADD FOREIGN KEY LINEITEM_FK2 (L_PARTKEY,L_SUPPKEY) references tpch.PARTSUPP(PS_PARTKEY,PS_SUPPKEY);
COMMIT WORK;
复制代码

SQL 语句

--SQL1 select /*+ PQ(16) */    l_returnflag,    l_linestatus,    sum(l_quantity) as sum_qty,    sum(l_extendedprice) as sum_base_price,    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,    avg(l_quantity) as avg_qty,    avg(l_extendedprice) as avg_price,    avg(l_discount) as avg_disc,    count(*) as count_orderfrom    lineitemwhere    l_shipdate <= date '1998-12-01' - interval '88' daygroup by    l_returnflag,    l_linestatusorder by    l_returnflag,    l_linestatus limit 1;
--SQL3select /*+ PQ(16) */ l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriorityfrom customer, orders, lineitemwhere c_mktsegment = 'MACHINERY' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date '1995-03-01' and l_shipdate > date '1995-03-01'group by l_orderkey, o_orderdate, o_shippriorityorder by revenue desc, o_orderdate limit 10;

--SQL6select /*+ PQ(16) */ sum(l_extendedprice * l_discount) as revenuefrom lineitemwhere l_shipdate >= date '1993-01-01' and l_shipdate < date '1993-01-01' + interval '1' year and l_discount between 0.02 - 0.01 and 0.02 + 0.01 and l_quantity < 24 limit 1;


--SQL10select /*+ PQ(16) */ c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_commentfrom customer, orders, lineitem, nationwhere c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= date '1994-05-01' and o_orderdate < date '1994-05-01' + interval '3' month and l_returnflag = 'R' and c_nationkey = n_nationkeygroup by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_commentorder by revenue desc limit 20;




--SQL12select /*+ PQ(16) */ l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_countfrom orders, lineitemwhere o_orderkey = l_orderkey and l_shipmode in ('TRUCK', 'FOB') and l_commitdate < l_receiptdate and l_shipdate < l_commitdate and l_receiptdate >= date '1996-01-01' and l_receiptdate < date '1996-01-01' + interval '1' yeargroup by l_shipmodeorder by l_shipmode limit 1 ;


--SQL19select /*+ PQ(16) */ sum(l_extendedprice* (1 - l_discount)) as revenuefrom lineitem, partwhere ( p_partkey = l_partkey and p_brand = 'Brand#12' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 10 and l_quantity <= 10 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#22' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 15 and l_quantity <= 15 + 10 and p_size between 1 and 10 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#43' and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') and l_quantity >= 22 and l_quantity <= 22 + 10 and p_size between 1 and 15 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) limit 1 ;
复制代码

相关问题

参考资料


https://imysql.com/2012/12/21/tpch-for-mysql-manual.html

发布于: 刚刚阅读数: 2
用户头像

GreatSQL

关注

GreatSQL社区 2023-01-31 加入

GreatSQL是由万里数据库维护的MySQL分支,专注于提升MGR可靠性及性能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版本。 社区:https://greatsql.cn/ Gitee: https://gitee.com/GreatSQL/GreatSQL

评论

发布
暂无评论
使用TPC-H 进行GreatSQL并行查询测试_MySQL_GreatSQL_InfoQ写作社区