写点什么

openGauss 内核分析:执行计划生成

  • 2022 年 8 月 29 日
    中国香港
  • 本文字数:2571 字

    阅读完需:约 8 分钟

openGauss内核分析:执行计划生成

本文分享自华为云社区《openGauss内核分析(六) 执行计划生成》,作者:Gauss 松鼠会。


SQL 语句解析完成后被解析成 Query 结构,在进行优化时是以 Query 为单位进行的,Query 的优化分为基于规则的逻辑优化(查询重写)和基于代价的物理优化(计划生成),主入口函数为 subquery_planner。subquery_planner 函数接收 Query(查询树),返回一个 Plan(计划树)。


Plan* subquery_planner(PlannerGlobal* glob, Query* parse, PlannerInfo* parent_root, bool hasRecursion,    double tuple_fraction, PlannerInfo** subroot, int options, ItstDisKey* diskeys, List* subqueryRestrictInfo){    PlannerInfo* root = NULL;Plan* plan = NULL; //返回结果preprocess_const_params(root, (Node*)parse->jointree); // 常数替换等式    if (parse->hasSubLinks) {        pull_up_sublinks(root); //提升子链接        DEBUG_QRW("After sublink pullup");    } /* Reduce orderby clause in subquery for join */    reduce_orderby(parse, false); //减少orderby    DEBUG_QRW("After order by reduce");    if (u_sess->attr.attr_sql.enable_constraint_optimization) {        removeNotNullTest(root); //删除NotNullTest        DEBUG_QRW("After soft constraint removal");    }    if ((LAZY_AGG & u_sess->attr.attr_sql.rewrite_rule) && permit_from_rewrite_hint(root, LAZY_AGG)) {        lazyagg_main(parse); // lazyagg重写        DEBUG_QRW("After lazyagg");}parse->jointree = (FromExpr*)pull_up_subqueries(root, (Node*)parse->jointree); //提升子查询    if (parse->setOperations) {        flatten_simple_union_all(root); //UNIONALL优化        DEBUG_QRW("After simple union all flatten");}expand_inherited_tables(root); //展开继承表parse->targetList = (List*)preprocess_expression(root, (Node*)parse->targetList, EXPRKIND_TARGET); //预处理表达式parse->havingQual = (Node *) newHaving; //处理HAVING子句        reduce_outer_joins(root); //外连接消除                reduce_inequality_fulljoins(root); //全连接重写plan = grouping_planner(root, tuple_fraction); //主要的计划过程return plan;}
复制代码


subquery_planner 函数由函数 standard_planner 调用,standard_planner 函数由 exec_simple_query->pg_plan_queries->pg_plan_query->planner 函数调用。standard_planner 将 Query(查询树)生成规划好的语句,可用于执行器实际执行。


PlannedStmt* standard_planner(Query* parse, int cursorOptions, ParamListInfo boundParams){    PlannedStmt* result = NULL;  //返回结果PlannerGlobal* glob = NULL;  double tuple_fraction;    PlannerInfo* root = NULL;Plan* top_plan = NULL;    glob = makeNode(PlannerGlobal);  /* primary planning entry point (may recurse for subqueries) */top_plan = subquery_planner(glob, parse, NULL, false, tuple_fraction, &root);  //主规划过程入口 /* build the PlannedStmt result */result = makeNode(PlannedStmt);  //构造PlannedStmtresult->commandType = parse->commandType;result->queryId = parse->queryId;result->uniqueSQLId = parse->uniqueSQLId;result->hasReturning = (parse->returningList != NIL);result->hasModifyingCTE = parse->hasModifyingCTE;result->canSetTag = parse->canSetTag;result->transientPlan = glob->transientPlan;result->dependsOnRole = glob->dependsOnRole;result->planTree = top_plan; //执行计划result->rtable = glob->finalrtable;result->resultRelations = glob->resultRelations;    return result;}
复制代码


仍然以前文的 join 列子来说明


SELECT * FROM t1 inner JOIN t2 ON t1.c1 = t2.c1;
复制代码


在 planner 函数打断点,用 gdb 查看 standard_planner 返回的 PlannedStmt


(gdb) bt#0  planner (parse=0x7fd93a410288, cursorOptions=0, boundParams=0x0) at planner.cpp:389#1  0x0000000001936fbd in pg_plan_query (querytree=0x7fd93a410288, cursorOptions=0, boundParams=0x0, underExplain=false) at postgres.cpp:1197#2  0x0000000001937381 in pg_plan_queries (querytrees=0x7fd939b81090, cursorOptions=0, boundParams=0x0) at postgres.cpp:1315#3  0x000000000193a6b8 in exec_simple_query (query_string=0x7fd966ad2060 "SELECT * FROM t1 inner JOIN t2 ON t1.c1 = t2.c1;", messageType=QUERY_MESSAGE, msg=0x7fd931056210)    at postgres.cpp:2560#4  0x0000000001947104 in PostgresMain (argc=1, argv=0x7fd93a2cf1c0, dbname=0x7fd93a2ce1f8 "postgres", username=0x7fd93a2ce1b0 "test") at postgres.cpp:8403#5  0x0000000001890740 in BackendRun (port=0x7fd931056720) at postmaster.cpp:8053#6  0x00000000018a00b1 in GaussDbThreadMain<(knl_thread_role)1> (arg=0x7fd97c55c5f0) at postmaster.cpp:12181#7  0x000000000189c0de in InternalThreadFunc (args=0x7fd97c55c5f0) at postmaster.cpp:12755#8  0x00000000024bf7d8 in ThreadStarterFunc (arg=0x7fd97c55c5e0) at gs_thread.cpp:382#9  0x00007fd9a60cfdd5 in start_thread () from /lib64/libpthread.so.0#10 0x00007fd9a5df8ead in clone () from /lib64/libc.so.6(gdb) p *result$14 = {type = T_PlannedStmt, commandType = CMD_SELECT, queryId = 0, hasReturning = false, hasModifyingCTE = false, canSetTag = true, transientPlan = false, dependsOnRole = false,  planTree = 0x7fd93a409d58, rtable = 0x7fd939b81660, …}(gdb) p *result->planTree->lefttree$46 = {type = T_SeqScan, plan_node_id = 2, parent_node_id = 1, exec_type = EXEC_ON_DATANODES, startup_cost = 0, total_cost = 1.03, plan_rows = 3, multiple = 1, plan_width = 8,…}
复制代码


将 Query 规划后得到 PlannedStmt




可以看到,Plannedstmt 与 explain 执行计划是一致的



点击关注,第一时间了解华为云新鲜技术~

发布于: 刚刚阅读数: 3
用户头像

提供全面深入的云计算技术干货 2020.07.14 加入

华为云开发者社区,提供全面深入的云计算前景分析、丰富的技术干货、程序样例,分享华为云前沿资讯动态,方便开发者快速成长与发展,欢迎提问、互动,多方位了解云计算! 传送门:https://bbs.huaweicloud.com/

评论

发布
暂无评论
openGauss内核分析:执行计划生成_数据库_华为云开发者联盟_InfoQ写作社区