cost 量化分析

2023-03-09
福建
本文字数：23662 字
阅读完需：约 78 分钟

GreatSQL 社区原创内容未经授权不得随意使用，转载请联系小编并注明来源。
GreatSQL 是 MySQL 的国产分支版本，使用上与 MySQL 一致。
作者： xryz
文章来源：GreatSQL 社区原创

前言：

我们在日常维护数据库的时候，经常会遇到查询慢的语句，这时候一般会通过执行 EXPLAIN 去查看它的执行计划，但是执行计划往往只给我们带来了最基础的分析信息，比如是否有使用索引，还有一些其他供我们分析的信息，比如使用了临时表、排序等等，却无法展示为什么一些其他的执行计划未被选择，比如说明明有索引，或者好几个索引，但是为什么查询时未使用到期望的索引等

explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+| id | select_type | table | partitions | type   | possible_keys                        | key           | key_len | ref            | rows | filtered | Extra                 |+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+|  1 | SIMPLE      | t2    | NULL       | range  | id_num_unique,idx_age,idx_age_id_num | idx_age       | 1       | NULL           | 9594 |   100.00 | Using index condition ||  1 | SIMPLE      | t1    | NULL       | eq_ref | id_num_unique,idx_age                | id_num_unique | 60      | test.t2.id_num |    1 |    50.00 | Using where           |+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+2 rows in set, 1 warning (0.01 sec)

复制代码

如上面这个例子，为什么 t2 表上列出了多个可能使用的索引，却选择了 idx_age，优化器为什么选择了指定的索引，这时候并不能直观的看出问题，这时候我们就可以开启 optimizer_trace 跟踪分析 MySQL 具体是怎么选择出最优的执行计划的。

OPTIMIZER_TRACE：

optimizer_trace 是什么：

optimizer_trace 是一个具有跟踪功能的工具，可以跟踪执行的语句的解析优化执行过程，并将跟踪到的信息记录到 INFORMATION_SCHEMA.OPTIMIZER_TRACE 表中，但是每个会话都只能跟踪它自己执行的语句，并且表中默认只记录最后一个查询的跟踪结果

使用方法：

# 打开optimizer trace功能 (默认情况下它是关闭的):set optimizer_trace="enabled=on";select ...; # 这里输入你自己的查询语句SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;# 当你停止查看语句的优化过程时，把optimizer trace功能关闭set optimizer_trace="enabled=off";

复制代码

optimizer_trace 表信息：

该表总共有 4 个字段

QUERY 表示我们的查询语句。
TRACE 表示优化过程的 JSON 格式文本。(重点关注)
MISSING_BYTES_BEYOND_MAX_MEM_SIZE 由于优化过程可能会输出很多，如果超过某个限制时，多余的文本将不会被显示，这个字段展示了被忽略的文本字节数。
INSUFFICIENT_PRIVILEGES 表示是否没有权限查看优化过程，默认值是 0，只有某些特殊情况下才会是 1，我们暂时不关心这个字段的值。

信息解读：

通过 optimizer_trace 表的 query 字段可以看到，一条语句的执行过程主要分为三个步骤：

"join_preparation": {},（准备阶段）"join_optimization": {},（优化阶段）"join_execution": {},（执行阶段）

复制代码

各个步骤的详细内容解读：

preparation：

expanded_query ：将语句进行格式化，补充隐藏的列名和表名等transformations_to_nested_joins ：查询重写，比如join的on改为where语句

复制代码

optimization：

condition_processing{ ：条件句处理。    transformation{：转换类型句。这三次的转换分别是        equality_propagation（等值条件句转换），如：a = b and b = c and c = 5        constant_propagation（常量条件句转换），如：a = 1 AND b > a        trivial_condition_removal（无效条件移除的转换），如：1 = 1    }}substitute_generated_columns ：替换虚拟生成列，测试了很多sql，这一列都没有看到有用的信息table_dependencies ：梳理表之间的依赖关系。ref_optimizer_key_uses ：如果优化器认为查询可以使用ref的话，在这里列出可以使用的索引。rows_estimation{ ：估算表行数和扫描的代价。如果查询中存在range扫描的话，对range扫描进行计划分析及代价估算。  table_scan：全表扫描的行数(rows)以及所需要的代价(cost)。  potential_range_indexes：该阶段会列出表中所有的索引并分析其是否可用，并且还会列出索引中可用的列字段。  analyzing_range_alternatives ：分析可选方案的代价。}considered_execution_plans{ ：对比各可行计划的代价，选择相对最优的执行计划。  plan_prefix：前置的执行计划。  best_access_path：当前最优的执行顺序信息结果集。  access_type表示使用索引的方式，可参照为explain中的type字段。  condition_filtering_pct：类似于explain中的filtered列，这是一个估算值。  rows_for_plan：该执行计划最终的扫描行数，这里的行数其实也是估算值，是由considered_access_paths的resulting_rows相乘之后再乘以condition_filtering_pct获得。  cost_for_plan：该执行计划的执行代价，由considered_access_paths的cost相加而得。  chosen：是否选择了该执行计划。}attaching_conditions_to_tables ：添加附加条件，使得条件尽可能筛选单表数据。refine_plan ：优化后的执行计划。

复制代码

trace 信息中的 json 信息很长，因为我们关心的是不同执行计划的 cost 区别，所以只需要重点关注两个部分 rows_estimation 和 considered_execution_plans

代价模型计算：

统计信息和 cost 计算参数：

计算 cost 会涉及到表的主键索引数据页(聚簇索引)数量和表中的记录数，两个信息都可以通过 innodb 的表统计信息 mysql.innodb_table_stats 查到，n_rows 是记录数，clustered_index_size 是聚簇索引页数。

mysql> select * from mysql.innodb_table_stats where table_name='basic_person_info';+---------------+-------------------+---------------------+--------+----------------------+--------------------------+| database_name | table_name        | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |+---------------+-------------------+---------------------+--------+----------------------+--------------------------+| test          | basic_person_info | 2022-12-23 18:27:24 |  86632 |                  737 |                     1401 |+---------------+-------------------+---------------------+--------+----------------------+--------------------------+1 row in set (0.01 sec)

复制代码

代价模型将操作分为 Server 层和 Engine（存储引擎）层两类，Server 层主要是 CPU 代价，Engine 层主要是 IO 代价，比如 MySQL 从磁盘读取一个数据页的代价 io_block_read_cost 为 1，从 buffer pool 读取的代价 memory_block_read_cost 为 0.25，计算符合条件的行代价为 row_evaluate_cost 为 0.1，除此之外还有：

memory_temptable_create_cost (default 1.0) 内存临时表的创建代价。
memory_temptable_row_cost (default 0.1) 内存临时表的行代价。
key_compare_cost (default 0.1) 键比较的代价，例如排序。
disk_temptable_create_cost (default 20.0) 内部 myisam 或 innodb 临时表的创建代价。
disk_temptable_row_cost (default 0.5) 内部 myisam 或 innodb 临时表的行代价。

这些都可以通过 mysql.server_cost、mysql.engine_cost 查看 defalt 值和设置值

mysql> select * from mysql.server_cost;+------------------------------+------------+---------------------+---------+---------------+| cost_name                    | cost_value | last_update         | comment | default_value |+------------------------------+------------+---------------------+---------+---------------+| disk_temptable_create_cost   |       NULL | 2022-05-11 16:09:37 | NULL    |            20 || disk_temptable_row_cost      |       NULL | 2022-05-11 16:09:37 | NULL    |           0.5 || key_compare_cost             |       NULL | 2022-05-11 16:09:37 | NULL    |          0.05 || memory_temptable_create_cost |       NULL | 2022-05-11 16:09:37 | NULL    |             1 || memory_temptable_row_cost    |       NULL | 2022-05-11 16:09:37 | NULL    |           0.1 || row_evaluate_cost            |       NULL | 2022-05-11 16:09:37 | NULL    |           0.1 |+------------------------------+------------+---------------------+---------+---------------+mysql> select * from mysql.engine_cost;+-------------+-------------+------------------------+------------+---------------------+---------+---------------+| engine_name | device_type | cost_name              | cost_value | last_update         | comment | default_value |+-------------+-------------+------------------------+------------+---------------------+---------+---------------+| default     |           0 | io_block_read_cost     |       NULL | 2022-05-11 16:09:37 | NULL    |             1 || default     |           0 | memory_block_read_cost |       NULL | 2023-01-09 11:17:39 | NULL    |          0.25 |+-------------+-------------+------------------------+------------+---------------------+---------+---------------+

复制代码

计算公式：

如上面介绍的一样，代价模型将操作分为两类 io_cost 和 cpu_cost，io_cost+cpu_cost 就是总的 cost，下面是具体的计算方法：

全表扫描：

全表扫描成本 = io_cost + 1.1 + cpu_cost + 1 (io_cost +1.1 和 cpu_cost +1 在代码里是直接硬加上的，不知道为什么，计算的时候直接加上)

io_cost = clustered_index_size (统计信息中的主键页数) * avg_single_page_cost(读取一个页的平均成本)

avg_single_page_cost = pages_in_memory_percent * 0.25(memory_block_read_cost) + pages_on_disk_percent * 1.0(io_block_read_cost)

pages_in_memory_percent 表示已经加载到 Buffer Pool 中的叶结点占所有叶结点的比例 pages_on_disk_percent 表示没有加载到 Buffer Pool 中的叶结点占所有叶结点的比例

所以当数据已经全部读取到 buffer pool 中的时候：

io_cost=clustered_index_size * 0.25

都没有读取到 buffer pool 中的时候：

io_cost=clustered_index_size * 1.0

当部分数据在 buffer pool 中，部分数据需要从磁盘读取时，这时的系数介于 0.25 到 1 之间

cpu_cost = n_rows(统计信息中记录数) * 0.1(row_evaluate_cost)

走索引的成本：

和全表扫描的计算方法类似，其中 io_cost 与搜索的区间数有关，比如扫描三个区间 where a between 1 and 10 or a between 20 and 30 or a between 40 and 50，此时：

io_cost=3 * avg_single_page_cost

cpu_cost=记录数 * 0.1(row_evaluate_cost)+0.01(代码中的微调参数)

针对二级索引还会有回表的操作：

MySQL 认为每次回表都相当于是访问一个页面，所以每次回表都会进行一次 IO，这部分成本：

io_cost=rows(记录数)*avg_single_page_cost

对回表查询的数据还需要进行一次计算：

cpu_cost=rows(记录数) * 0.1(row_evaluate_cost)(需要注意的是当索引需要回表扫描时，在 rows_estimation 阶段并不会计算这个值，在 considered_execution_plans 阶段会重新加上这部分成本)

所以针对需要回表的查询：

io_cost=查询区间 * avg_single_page_cost + rows(记录数) * avg_single_page_cost

cpu_cost=记录数 * 0.1(row_evaluate_cost) + 0.01(代码中的微调参数) + rows(记录数) * 0.1(row_evaluate_cost)

例子：

mysql> set optimizer_trace='enabled=on';Query OK, 0 rows affected (0.00 sec)
mysql>explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+| id | select_type | table | partitions | type   | possible_keys                        | key           | key_len | ref            | rows | filtered | Extra                 |+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+|  1 | SIMPLE      | t2    | NULL       | range  | id_num_unique,idx_age,idx_age_id_num | idx_age       | 1       | NULL           | 9594 |   100.00 | Using index condition ||  1 | SIMPLE      | t1    | NULL       | eq_ref | id_num_unique,idx_age                | id_num_unique | 60      | test.t2.id_num |    1 |    50.00 | Using where           |+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+2 rows in set, 1 warning (0.04 sec)

复制代码

查看 optimizer_trace 的内容

select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20 | {  "steps": [    {      "join_preparation": {        "select#": 1,        "steps": [          {            "expanded_query": "/* select#1 */ select `t1`.`id` AS `id`,`t1`.`id_num` AS `id_num`,`t1`.`lastname` AS `lastname`,`t1`.`firstname` AS `firstname`,`t1`.`mobile` AS `mobile`,`t1`.`sex` AS `sex`,`t1`.`birthday` AS `birthday`,`t1`.`age` AS `age`,`t1`.`top_education` AS `top_education`,`t1`.`address` AS `address`,`t1`.`income_by_year` AS `income_by_year`,`t1`.`create_time` AS `create_time`,`t1`.`update_time` AS `update_time`,`t2`.`id` AS `id`,`t2`.`id_num` AS `id_num`,`t2`.`lastname` AS `lastname`,`t2`.`firstname` AS `firstname`,`t2`.`mobile` AS `mobile`,`t2`.`sex` AS `sex`,`t2`.`birthday` AS `birthday`,`t2`.`age` AS `age`,`t2`.`top_education` AS `top_education`,`t2`.`address` AS `address`,`t2`.`income_by_year` AS `income_by_year`,`t2`.`create_time` AS `create_time`,`t2`.`update_time` AS `update_time` from (`basic_person_info` `t1` join `basic_person_info2` `t2` on((`t1`.`id_num` = `t2`.`id_num`))) where ((`t1`.`age` > 10) and (`t2`.`age` < 20))"          },          {            "transformations_to_nested_joins": {              "transformations": [                "JOIN_condition_to_WHERE",                "parenthesis_removal"              ],              "expanded_query": "/* select#1 */ select `t1`.`id` AS `id`,`t1`.`id_num` AS `id_num`,`t1`.`lastname` AS `lastname`,`t1`.`firstname` AS `firstname`,`t1`.`mobile` AS `mobile`,`t1`.`sex` AS `sex`,`t1`.`birthday` AS `birthday`,`t1`.`age` AS `age`,`t1`.`top_education` AS `top_education`,`t1`.`address` AS `address`,`t1`.`income_by_year` AS `income_by_year`,`t1`.`create_time` AS `create_time`,`t1`.`update_time` AS `update_time`,`t2`.`id` AS `id`,`t2`.`id_num` AS `id_num`,`t2`.`lastname` AS `lastname`,`t2`.`firstname` AS `firstname`,`t2`.`mobile` AS `mobile`,`t2`.`sex` AS `sex`,`t2`.`birthday` AS `birthday`,`t2`.`age` AS `age`,`t2`.`top_education` AS `top_education`,`t2`.`address` AS `address`,`t2`.`income_by_year` AS `income_by_year`,`t2`.`create_time` AS `create_time`,`t2`.`update_time` AS `update_time` from `basic_person_info` `t1` join `basic_person_info2` `t2` where ((`t1`.`age` > 10) and (`t2`.`age` < 20) and (`t1`.`id_num` = `t2`.`id_num`))"            }          }        ]      }    },    {      "join_optimization": {        "select#": 1,        "steps": [          {            "condition_processing": {              "condition": "WHERE",              "original_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and (`t1`.`id_num` = `t2`.`id_num`))",              "steps": [                {                  "transformation": "equality_propagation",                  "resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"                },                {                  "transformation": "constant_propagation",                  "resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"                },                {                  "transformation": "trivial_condition_removal",                  "resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"                }              ]            }          },          {            "substitute_generated_columns": {            }          },          {            "table_dependencies": [              {                "table": "`basic_person_info` `t1`",                "row_may_be_null": false,                "map_bit": 0,                "depends_on_map_bits": [                ]              },              {                "table": "`basic_person_info2` `t2`",                "row_may_be_null": false,                "map_bit": 1,                "depends_on_map_bits": [                ]              }            ]          },          {            "ref_optimizer_key_uses": [              {                "table": "`basic_person_info` `t1`",                "field": "id_num",                "equals": "`t2`.`id_num`",                "null_rejecting": true              },              {                "table": "`basic_person_info2` `t2`",                "field": "id_num",                "equals": "`t1`.`id_num`",                "null_rejecting": true              }            ]          },          {            "rows_estimation": [              {                "table": "`basic_person_info` `t1`",                "range_analysis": {                  "table_scan": {                    "rows": 86734,                    "cost": 8859.75                                        t1表的scan成本=聚簇索引页数*0.25 + 行数 * 0.1 +1.1+1                    737*0.25+1.1+86734*0.1+1=8859.75                                      },                  "potential_range_indexes": [                    {                      "index": "PRIMARY",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "id_num_unique",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "mobile_unique",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_basic_person_info_name",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_basic_person_info_top_education",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_basic_person_info_create_time",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_basic_person_info_mobile",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_age",                      "usable": true,                      "key_parts": [                        "age",                        "id"                      ]                    }                  ],                  "setup_range_conditions": [                  ],                  "group_index_range": {                    "chosen": false,                    "cause": "not_single_table"                  },                  "skip_scan_range": {                    "chosen": false,                    "cause": "not_single_table"                  },                  "analyzing_range_alternatives": {                    "range_scan_alternatives": [                      {                        "index": "idx_age",                        "ranges": [                          "10 < age"                        ],                        "index_dives_for_eq_ranges": true,                        "rowid_ordered": false,                        "using_mrr": false,                        "index_only": false,                        "rows": 43367,                        "cost": 15178.7,                                                通过索引idx_age读取数据：                        io_cost=区间数* 0.25 +记录数* 0.25                        io_cost=1*0.25+43367*0.25=10,842                          cpu_cost=记录数* 0.1 (没有回表的cost)                        cpu_cost=43367*0.1=4,336.7                         cost=10842+4,336.7=15178.7                                                "chosen": false,                        "cause": "cost"                      }                    ],                    "analyzing_roworder_intersect": {                      "usable": false,                      "cause": "too_few_roworder_scans"                    }                  }                }              },              {                "table": "`basic_person_info2` `t2`",                "range_analysis": {                  "table_scan": {                    "rows": 73845,                    "cost": 7538.85                                        t2表的scan成本=聚簇索引页数*0.25 + 行数 * 0.1 +1.1+1                    609*0.25+1+73845*0.1+1.1=7538.85                                      },                  "potential_range_indexes": [                    {                      "index": "PRIMARY",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "id_num_unique",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "mobile_unique",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_basic_person_info_name",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_basic_person_info_top_education",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_basic_person_info_create_time",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_basic_person_info_mobile",                      "usable": false,                      "cause": "not_applicable"                    },                    {                      "index": "idx_age",                      "usable": true,                      "key_parts": [                        "age",                        "id"                      ]                    },                    {                      "index": "idx_age_id_num",                      "usable": true,                      "key_parts": [                        "age",                        "id_num",                        "id"                      ]                    }                  ],                  "setup_range_conditions": [                  ],                  "group_index_range": {                    "chosen": false,                    "cause": "not_single_table"                  },                  "skip_scan_range": {                    "chosen": false,                    "cause": "not_single_table"                  },                  "analyzing_range_alternatives": {                    "range_scan_alternatives": [                      {                        "index": "idx_age",                        "ranges": [                          "age < 20"                        ],                        "index_dives_for_eq_ranges": true,                        "rowid_ordered": false,                        "using_mrr": false,                        "index_only": false,                        "rows": 9594,                        "cost": 3358.16,                                                通过索引idx_age读取数据：                        io_cost=区间数* 0.25 +记录数* 0.25                        io_cost=1*0.25+9594*0.25=2,398.75                                cpu_cost=记录数* 0.1   (没有回表的cost)                         cpu_cost=9594*0.1959.4                          cost=2,398.75+959.4=3,358.15                                                "chosen": true                      },                      {                        "index": "idx_age_id_num",                        "ranges": [                          "age < 20"                        ],                        "index_dives_for_eq_ranges": true,                        "rowid_ordered": false,                        "using_mrr": false,                        "index_only": false,                        "rows": 19086,                        "cost": 6680.36,                                                通过索引idx_age_id_num读取数据：                        io_cost=区间数* 0.25 +记录数* 0.25                        io_cost=1*0.25+19086*0.25=4,771.75                                   cpu_cost=记录数* 0.1  (没有回表的cost)                          cpu_cost=19086*0.1=1908.6                        cost=4,771.75+1908.6=6,680.35                                                "chosen": false,                        "cause": "cost"                      }                    ],                    "analyzing_roworder_intersect": {                      "usable": false,                      "cause": "too_few_roworder_scans"                    }                  },                  "chosen_range_access_summary": {                    "range_access_plan": {                      "type": "range_scan",                      "index": "idx_age",                      "rows": 9594,                      "ranges": [                        "age < 20"                      ]                    },                    "rows_for_plan": 9594,                    "cost_for_plan": 3358.16,                    "chosen": true                  }                }              }            ]          },          {            "considered_execution_plans": [              {                "plan_prefix": [                ],                "table": "`basic_person_info2` `t2`",                "best_access_path": {                  "considered_access_paths": [                    {                      "access_type": "ref",                      "index": "id_num_unique",                      "usable": false,                      "chosen": false                    },                    {                      "rows_to_scan": 9594,                      "filtering_effect": [                      ],                      "final_filtering_effect": 1,                      "access_type": "range",                      "range_details": {                        "used_index": "idx_age"                      },                      "resulting_rows": 9594,                      "cost": 4317.56,                                              通过索引idx_age读取数据：                        io_cost=区间数* 0.25 +记录数* 0.25                        io_cost=1*0.25+9594*0.25=2,398.75                                cpu_cost=记录数* 0.1 + 记录数* 0.1                           cpu_cost=9594*0.1*2=1,918.8                          cost=2,398.75+1,918.8=4317.56                                            "chosen": true                    }                  ]                },                "condition_filtering_pct": 100,                "rows_for_plan": 9594,                "cost_for_plan": 4317.56,                "rest_of_plan": [                  {                    "plan_prefix": [                      "`basic_person_info2` `t2`"                    ],                    "table": "`basic_person_info` `t1`",                    "best_access_path": {                      "considered_access_paths": [                        {                          "access_type": "eq_ref",                          "index": "id_num_unique",                          "rows": 1,                          "cost": 3357.9,                                                    io_cost=t2表记录数*0.25=9594*0.25=2398.5                          cpu_cost=记录数*0.1=9594*0.1=959.4                          cost=2398.5+959.4=3357.9                                                    "chosen": true                        },                        {                          "rows_to_scan": 86734,                          "filtering_effect": [                          ],                          "final_filtering_effect": 0.5,                          "access_type": "scan",                          "using_join_cache": true,                          "buffers_needed": 14,                          "resulting_rows": 43367,                          "cost": 4.16701e+07,                          "chosen": false                        }                      ]                    },                    "condition_filtering_pct": 100,                    "rows_for_plan": 9594,                    "cost_for_plan": 7675.46,                                       总cost=4,317.56+3,357.9=7,675.46                                       "chosen": true                  }                ]              },              {                "plan_prefix": [                ],                "table": "`basic_person_info` `t1`",                "best_access_path": {                  "considered_access_paths": [                    {                      "access_type": "ref",                      "index": "id_num_unique",                      "usable": false,                      "chosen": false                    },                    {                      "rows_to_scan": 86734,                      "filtering_effect": [                      ],                      "final_filtering_effect": 0.5,                      "access_type": "scan",                      "resulting_rows": 43367,                      "cost": 8857.65,                                            t1的scan成本                                            "chosen": true                    }                  ]                },                "condition_filtering_pct": 100,                "rows_for_plan": 43367,                "cost_for_plan": 8857.65,                "pruned_by_cost": true                                放弃后续的计算                              }            ]          },          {            "attaching_conditions_to_tables": {              "original_condition": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10) and (`t2`.`age` < 20))",              "attached_conditions_computation": [              ],              "attached_conditions_summary": [                {                  "table": "`basic_person_info2` `t2`",                  "attached": "(`t2`.`age` < 20)"                },                {                  "table": "`basic_person_info` `t1`",                  "attached": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10))"                }              ]            }          },          {            "finalizing_table_conditions": [              {                "table": "`basic_person_info2` `t2`",                "original_table_condition": "(`t2`.`age` < 20)",                "final_table_condition   ": "(`t2`.`age` < 20)"              },              {                "table": "`basic_person_info` `t1`",                "original_table_condition": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10))",                "final_table_condition   ": "(`t1`.`age` > 10)"              }            ]          },          {            "refine_plan": [              {                "table": "`basic_person_info2` `t2`",                "pushed_index_condition": "(`t2`.`age` < 20)",                "table_condition_attached": null              },              {                "table": "`basic_person_info` `t1`"              }            ]          }        ]      }    },    {      "join_execution": {        "select#": 1,        "steps": [        ]      }    }  ]}

复制代码

成本常数修改：

前面已经介绍了成本常量值实际上存放在 MySQL 自带的系统库 MySQL 中的 server_cost 和 engine_cost 表中，其中 server_cost 表存放 server 层的成本常量，engine_cost 表存放 engine 层成本常量

mysql> select * from mysql.server_cost;+------------------------------+------------+---------------------+---------+---------------+| cost_name                    | cost_value | last_update         | comment | default_value |+------------------------------+------------+---------------------+---------+---------------+| disk_temptable_create_cost   |       NULL | 2022-05-11 16:09:37 | NULL    |            20 || disk_temptable_row_cost      |       NULL | 2022-05-11 16:09:37 | NULL    |           0.5 || key_compare_cost             |       NULL | 2022-05-11 16:09:37 | NULL    |          0.05 || memory_temptable_create_cost |       NULL | 2022-05-11 16:09:37 | NULL    |             1 || memory_temptable_row_cost    |       NULL | 2022-05-11 16:09:37 | NULL    |           0.1 || row_evaluate_cost            |       NULL | 2022-05-11 16:09:37 | NULL    |           0.1 |+------------------------------+------------+---------------------+---------+---------------+
mysql> select * from mysql.engine_cost;+-------------+-------------+------------------------+------------+---------------------+---------+---------------+| engine_name | device_type | cost_name              | cost_value | last_update         | comment | default_value |+-------------+-------------+------------------------+------------+---------------------+---------+---------------+| default     |           0 | io_block_read_cost     |       NULL | 2022-05-11 16:09:37 | NULL    |             1 || default     |           0 | memory_block_read_cost |       NULL | 2023-01-09 11:17:39 | NULL    |          0.25 |+-------------+-------------+------------------------+------------+---------------------+---------+---------------+

复制代码

其中 default_value 的值是系统默认的，不能修改，cost_value 列的值我们可以修改，如果 cost_value 列的值不为空系统将用该值覆盖默认值，我们可以通过 update 语句来修改

mysql> update mysql.engine_cost set cost_value=10 where cost_name='memory_block_read_cost';Query OK, 0 rows affected (0.00 sec)mysql> update mysql.engine_cost set cost_value=10 where cost_name='io_block_read_cost';Query OK, 0 rows affected (0.00 sec)

复制代码

很多资料都说执行 flush optimizer_costs 就可以生效，不过我在修改完后并执行 flush optimizer_costs 并不能马上生效，最后是通过重启数据库实例才生效，这个可能是数据库版本的差异，大家可以自行验证。

mysql> explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+| id | select_type | table | partitions | type   | possible_keys                        | key           | key_len | ref            | rows  | filtered | Extra       |+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+|  1 | SIMPLE      | t2    | NULL       | ALL    | id_num_unique,idx_age,idx_age_id_num | NULL          | NULL    | NULL           | 73990 |    12.97 | Using where ||  1 | SIMPLE      | t1    | NULL       | eq_ref | id_num_unique,idx_age                | id_num_unique | 60      | test.t2.id_num |     1 |    50.00 | Using where |+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+
"table": "`basic_person_info2` `t2`",                "range_analysis": {                  "table_scan": {                    "rows": 73990,                    "cost": 13491.1                                      全表扫描cost=609*10+73990*0.1+1.1+1= 13491.1                                     },"index": "idx_age",                        "ranges": [                          "age < 20"                        ],                        "index_dives_for_eq_ranges": true,                        "rowid_ordered": false,                        "using_mrr": false,                        "index_only": false,                        "rows": 9594,                        "cost": 96909.4,                                                idx_age索引扫描cost=1*10+9594*10+9594*0.1=96,909.4                                                "chosen": false,                        "cause": "cost"                      },

复制代码

修改后的执行计划，发现 t2 表走了全表扫描了而没有走 idx_age 索引，分别查看一下 t2 表走全表扫描和 idx_age 索引的 cost 发现全表扫描的 cost 为 13491.1，而走索引的 cost 为 96,909.4，因为全表扫描的 cost 比走索引低，所以优化器没有选择 idx_age 索引。

从这个例子可以看出，更改成本常量值会直接影响优化器的方案选择，所以一定要慎重，没有特殊原因建议不要修改。

explain format=json

虽然通过 optimizer_trace 可以看到很多详细的优化器选择过程，但是使用起来起来还是比较麻烦，需要过滤的信息很多，这时 explain format=json 输出 json 格式的分析数据也是一个不错的选择，它也包含语句将要执行的成本信息，如下：

query_cost  总查询成本read_cost   IO成本+除 eval_cost以外cpu成本eval_cost   检测rows * filter条记录的成本prefix_cost 单次查询的成本,等于read_cost+eval_costmysql> explain format=json select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;{  "query_block": {    "select_id": 1,    "cost_info": {      "query_cost": "7675.46"    },    "nested_loop": [      {        "table": {          "table_name": "t2",          "access_type": "range",          "possible_keys": [            "id_num_unique",            "idx_age",            "idx_age_id_num"          ],          "key": "idx_age",          "used_key_parts": [            "age"          ],          "key_length": "1",          "rows_examined_per_scan": 9594,          "rows_produced_per_join": 9594,          "filtered": "100.00",          "index_condition": "(`test`.`t2`.`age` < 20)",          "cost_info": {            "read_cost": "3358.16",            包含所有io成本+(cpu成本-eval_cost)            "eval_cost": "959.40",            计算扇出的cpu成本，优化器利用启发式规则估算出满足所有条件的的比例(filtered)            =rows_examined_per_scan*filtered*0.1            "prefix_cost": "4317.56",            单表查询的总成本                        "data_read_per_join": "3M"          },          "used_columns": [            "id",            "id_num",            "lastname",            "firstname",            "mobile",            "sex",            "birthday",            "age",            "top_education",            "address",            "income_by_year",            "create_time",            "update_time"          ]        }      },      {        "table": {          "table_name": "t1",          "access_type": "eq_ref",          "possible_keys": [            "id_num_unique",            "idx_age"          ],          "key": "id_num_unique",          "used_key_parts": [            "id_num"          ],          "key_length": "60",          "ref": [            "test.t2.id_num"          ],          "rows_examined_per_scan": 1,          "rows_produced_per_join": 4797,          "filtered": "50.00",          "cost_info": {            "read_cost": "2398.50",            包含所有io成本+(cpu成本-eval_cost)            "eval_cost": "479.70",            计算扇出的cpu成本，优化器利用启发式规则估算出满足所有条件的的比例(filtered)            =rows_examined_per_scan*filtered*0.1            "prefix_cost": "7675.46",            两表查询的总cost            "data_read_per_join": "1M"          },          "used_columns": [            "id",            "id_num",            "lastname",            "firstname",            "mobile",            "sex",            "birthday",            "age",            "top_education",            "address",            "income_by_year",            "create_time",            "update_time"          ],          "attached_condition": "(`test`.`t1`.`age` > 10)"        }      }    ]  }}

复制代码

另外，explain 结合 show warnings 语句一起使用还可以得知优化器改写后的语句

mysql> show warnings;+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+| Level | Code | Message                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+| Note  | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`id_num` AS `id_num`,`test`.`t1`.`lastname` AS `lastname`,`test`.`t1`.`firstname` AS `firstname`,`test`.`t1`.`mobile` AS `mobile`,`test`.`t1`.`sex` AS `sex`,`test`.`t1`.`birthday` AS `birthday`,`test`.`t1`.`age` AS `age`,`test`.`t1`.`top_education` AS `top_education`,`test`.`t1`.`address` AS `address`,`test`.`t1`.`income_by_year` AS `income_by_year`,`test`.`t1`.`create_time` AS `create_time`,`test`.`t1`.`update_time` AS `update_time`,`test`.`t2`.`id` AS `id`,`test`.`t2`.`id_num` AS `id_num`,`test`.`t2`.`lastname` AS `lastname`,`test`.`t2`.`firstname` AS `firstname`,`test`.`t2`.`mobile` AS `mobile`,`test`.`t2`.`sex` AS `sex`,`test`.`t2`.`birthday` AS `birthday`,`test`.`t2`.`age` AS `age`,`test`.`t2`.`top_education` AS `top_education`,`test`.`t2`.`address` AS `address`,`test`.`t2`.`income_by_year` AS `income_by_year`,`test`.`t2`.`create_time` AS `create_time`,`test`.`t2`.`update_time` AS `update_time` from `test`.`basic_person_info` `t1` join `test`.`basic_person_info2` `t2` where ((`test`.`t1`.`id_num` = `test`.`t2`.`id_num`) and (`test`.`t1`.`age` > 10) and (`test`.`t2`.`age` < 20)) |+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+1 row in set (0.00 sec)

复制代码

总结：

MySQL 的优化器是基于成本来选择最优执行方案的，哪个成本最少就选哪个，所以重点在于计算出各个执行计划的 cost
成本由 CPU 成本和 IO 成本组成，每个成本常数值可以自己调整，非必要的情况下不要调整，以免影响整个数据库的执行计划选择
通过开启 optimizer_trace 可以跟踪优化器的各个环节的分析步骤，可以判断有时候为什么没有走索引而走了全表扫描
explain 加上 format=json 选项后可以查看成本信息分为 read_cost 和 eval_cost，但只能看到当前已经选择的执行计划，另外通过 show warnings 可以看到优化器改写后的语句

Enjoy GreatSQL :)

关于 GreatSQL

GreatSQL 是由万里数据库维护的 MySQL 分支，专注于提升 MGR 可靠性及性能，支持 InnoDB 并行查询特性，是适用于金融级应用的 MySQL 分支版本。

相关链接： GreatSQL社区 Gitee GitHub Bilibili

GreatSQL 社区：

社区博客有奖征稿详情：https://greatsql.cn/thread-100-1-1.html

技术交流群：

微信：扫码添加GreatSQL社区助手微信好友，发送验证信息加群。

发布于: 刚刚阅读数: 3

原文链接:【http://xie.infoq.cn/article/5df3a126ae5277612f90a92a0】。文章转载请联系作者。

GreatSQL

关注

GreatSQL社区 2023-01-31 加入

GreatSQL是由万里数据库维护的MySQL分支，专注于提升MGR可靠性及性能，支持InnoDB并行查询特性，是适用于金融级应用的MySQL分支版本。社区：https://greatsql.cn/ Gitee: https://gitee.com/GreatSQL/GreatSQL

发布

暂无评论

创作场景

cost 量化分析

前言：

OPTIMIZER_TRACE：

optimizer_trace 是什么：

使用方法：

相关参数：

optimizer_trace 表信息：

信息解读：

代价模型计算：

统计信息和 cost 计算参数：

计算公式：

全表扫描：

走索引的成本：

例子：

查看 optimizer_trace 的内容

成本常数修改：

explain format=json

总结：

关于 GreatSQL

GreatSQL 社区：

技术交流群：

GreatSQL

评论