小小林熬夜学编程

这个屌丝很懒，什么也没留下！

热门标签

mysql 查询优化器算法_MySQL查询优化器新特性介绍

作者：小小林熬夜学编程 | 2024-03-13 20:41:52

踩

optimizer_switch='index_merge=on,index_merge_intersection=off,block_nested_l

1.BNL(Block Nested-Loop)

2.BKA(Batch Key Access)

3.MRR(Multi Range Read)

4.ICP(Index Condition push)

序言：

我们使用MySQL，查询性能是一个很重要的指标，对于查询优化器如何使用索引？实际中用单表查询、多表join。对于单表查询好优化，多表join，从最原始的NL算法，到BNL(NL的升级版)，再到BKA(也是NL的升级版)。我们深入理解查询优化器的发展历程及新特性，下面就MySQL查询优化器的新特性着重讲解。

MySQL 5.6 开始引入许多新特性，如ICP、BKA、MRR等。

1.BNL(Block Nested-Loop)

优化对象：可以优化被join的表是ALL/index(全索引扫描)/Range。

BNL: 从MySQL 5.1就引入，到了5.6 由只支持join方式扩展到支持“外连接”、“半连接”，目的是利用join buffer减少内循环扫描的次数。

什么情况下可能用到BNL? 目前执行计划里type是：ALL/index/range 才可能使用 BNL算法。

怎么看查询优化器使用了BNL算法，而不是NL算法：

很明确的看的出来，在执行计划的Extra 列是：Using join buffer (Block Nested Loop)，就说明用了BNL。

下面是BNL的例子：

mysql> desc SELECT a.phone,a.loan_amount,a.duration FROM `xxx_order` as a straight_join(select order_id from xxx_order_xxx where order_id in('0000003fb42b4e8fa54335b7dcbd63e2','0000004f74b647cdab9a1d9b8163b01e','0000016b22e04a30bc2ae5ea9d49282a') ) as b where a.id=b.order_id\G*************************** 1. row ***************************id:1select_type:PRIMARY

table: a

type:ALLpossible_keys:PRIMARY

key: NULLkey_len:NULLref:NULLrows:34497996Extra:NULL

*************************** 2. row ***************************id:1select_type:PRIMARY

table: type:ALLpossible_keys:NULL

key: NULLkey_len:NULLref:NULLrows:3Extra: Usingwhere; Using joinbuffer (Block Nested Loop)*************************** 3. row ***************************id:2select_type: DERIVEDtable: xxx_order_xxx

type: range

possible_keys: order_idkey: order_id

key_len:96ref:NULLrows:3Extra: Usingwhere; Using indexmysql> DESC SELECT a.phone,a.loan_amount,a.duration FROM `xxx_order` as a straight_join xxx_order_xxx as b force index(xxx_order_xxx_2fb72c6e) where b.coupon_id in(184506,184508,184509) and a.id=b.order_id\G*************************** 1. row ***************************id:1select_type: SIMPLEtable: a

type:ALLpossible_keys:PRIMARY

key: NULLkey_len:NULLref:NULLrows:34498271Extra:NULL

*************************** 2. row ***************************id:1select_type: SIMPLEtable: b

type: range

possible_keys: xxx_order_xxx_2fb72c6ekey: xxx_order_xxx_2fb72c6e

key_len:5ref:NULLrows:3Extra: Usingindex condition; Using where; Using joinbuffer (Block Nested Loop)--下面是LEFT OUTER JOIN:

mysql> DESC SELECT a.phone,a.loan_amount,a.duration FROM `xxx_order` as a LEFT OUTER JOIN (select order_id from xxx_order_xxx where order_id in('0000003fb42b4e8fa54335b7dcbd63e2','0000004f74b647cdab9a1d9b8163b01e','0000016b22e04a30bc2ae5ea9d49282a') ) as b ON a.id=b.order_id\G*************************** 1. row ***************************id:1select_type:PRIMARY

table: a

type:ALLpossible_keys:NULL

key: NULLkey_len:NULLref:NULLrows:34498410Extra:NULL

*************************** 2. row ***************************id:1select_type:PRIMARY

table: type:ALLpossible_keys:NULL

key: NULLkey_len:NULLref:NULLrows:3Extra: Usingwhere; Using joinbuffer (Block Nested Loop)*************************** 3. row ***************************id:2select_type: DERIVEDtable: xxx_order_xxx

type: range

possible_keys: order_idkey: order_id

key_len:96ref:NULLrows:3Extra: Usingwhere; Using index

--下面是semi-join(自连接):

mysql> DESC SELECT a.phone,a.loan_amount,a.duration,b.duration FROM xxx_order as a,xxx_order as b where a.loan_amount=b.loan_amount\G*************************** 1. row ***************************id:1select_type: SIMPLEtable: a

type:ALLpossible_keys:NULL

key: NULLkey_len:NULLref:NULLrows:34498593Extra:NULL

*************************** 2. row ***************************id:1select_type: SIMPLEtable: b

type:ALLpossible_keys:NULL

key: NULLkey_len:NULLref:NULLrows:34498593Extra: Usingwhere; Using join buffer (Block Nested Loop)

下面NL、BNL算法伪代码来自互连接网：

Nested-Loop(NL) 的伪算法如下:for each row int1 matching range {for each row in t2 matching reference key{for each row int3 {if row satisfies joinconditions,

sendtoclient

}

Block Nested-Loop(BNL) Join算法如下：for each row int1 matching range {for each row in t2 matching reference key{

store used columnsfrom t1, t2 in joinbufferif buffer is full{for each row int3 {for each t1, t2 combination in joinbuffer {if row satisfies joinconditions,

sendtoclient

}

empty buffer

}

}if buffer is notempty {for each row int3 {for each t1, t2 combination in joinbuffer {if row satisfies joinconditions,

sendtoclient

}

如何更好使用优化后的BNL算法：

mysql> show variables like 'optimizer_switch';| optimizer_switch | index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=

on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,looses

can=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on |把 block_nested_loop=on 即可。

如何更好的使用到BNL：

join_buffer_size 能一次容纳外层所有的元组(col1,col2,........) ，适当增大join_buffer_size，但是join_buffer_size>＝所有元组size，再增加join_buffer_size不会再提升查询速度。

2.BKA(Batch Key Access)

优化对象：

被join的表有索引(type: ref/eq_ref 可以用到BKA)，与BNL 的区别:被join的表无索引或者走的全索引才使用使用BNL，否则可能使用BKA。

什么情况下可能用到BKA?

目前被join的表执行计划里type是：ref、eq_ref 才可能使用BKA。

Extra: Using join buffer (Batched Key Access) 此时用的BKA，而不是BNL。

打开BKA(因BKA下层要用MRR，所以MRR也要同时打开)：

mysql> set optimizer_switch="mrr=on,mrr_cost_based=off,batched_key_access=on";

下面是使用BKA的sql例子：

mysql> DESC SELECT a.phone,a.loan_amount,a.duration FROM `xxx_order` as a LEFT OUTER JOIN xxx_order_xxx as b ON a.id=b.order_id where a.user_id>b.device_type\G*************************** 1. row ***************************id:1select_type: SIMPLEtable: b

type:ALLpossible_keys: order_idkey: NULLkey_len:NULLref:NULLrows:34305976Extra: Usingwhere

*************************** 2. row ***************************id:1select_type: SIMPLEtable: a

type: eq_ref

possible_keys:PRIMARY,xxx_order_user_id_482c26488354e9c3_fk_users_account_idkey: PRIMARYkey_len:96ref: hydra.b.order_id

rows:1Extra: Usingwhere

BKA 会调用MRR，BKA批量处理被join表的type是：ref、eq_ref ，这里回表读取批量的数据可能会有很多随机IO，BKA很聪明，这个时候利用MRR，通过索引列排序主键值，由随机回表IO转成顺序IO。

如何更好的使用BKA这一新特性：主要在MRR，所以适当增大参数 read_rnd_buffer_size 的值。

3.MRR(Multi Range Read)

优化对象：

使用到二级索引的范围Range查询、ref/eq_ref 使用BKA的时候也可能用到MRR。

下面就用这个sql来说明MRR：

select non_key_col1,non_key_col2,... from tb where key_col=x;

没有MRR之前二级索引的查询是这样的：(下面2个图截自互联网)

第一步先根据where条件中的二级索引获取辅助索引与主键的集合，结果集为r。select key_col, pk_col from tx where key_col=x order by key_col; --使用sort buffer

第二步通过第一步获取的主键来获取对应的值。for each pk_col_value inr do:select non_key_col1,non_key_col2,... from tb where pk_col=pk_col_value;

mrr_1.png?version=1&modificationDate=1498141627000&api=v2

有MRR之后二级索引的查询是这样的：

第一步先根据where条件中的二级索引获取辅助索引与主键的集合，结果集为r1。select key_col, pk_col from tx where key_col=x order bykey_col;

第二步将结果集r1放在buffer里面(直到read_rnd_buffer_size 满)，然后对结果集r1按照pk_col排序，得到结果集是r2

第三步利用已经排序过的结果集，访问表中的数据，此时是顺序IO.select non_key_col1,non_key_col2,... from tx where pk_col in (r2);

mrr_2.png?version=1&modificationDate=1498141663000&api=v2

下面是使用MRR 的例子:

mysql> desc select * from xxx_order where created_time>'2017-6-22 10:00:00' and created_time<='2017-6-22 10:00:01'\G*************************** 1. row ***************************id:1select_type: SIMPLEtable: xxx_order

type: range

possible_keys: xxx_order_created_time_12bbdaa144debf81_uniqkey: xxx_order_created_time_12bbdaa144debf81_uniq

key_len:8ref:NULLrows:3Extra: Usingindexcondition--上面没有使用到MRR，把mrr=on,mrr_cost_based=off后使用到MRR，如下：

mysql> set optimizer_switch="index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=off,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on";

Query OK,0 rows affected (0.00sec)

type: range

possible_keys: xxx_order_created_time_12bbdaa144debf81_uniqkey: xxx_order_created_time_12bbdaa144debf81_uniq

key_len:8ref:NULLrows:3Extra: Usingindex condition

如何更好的使用MRR这一新特性：适当增大参数 read_rnd_buffer_size 的值。

4.ICP(Index Condition push)

优化的对象： type是ref、eq_ref、range、ref_or_null 且当需要读取full table rows，5.6只支持InnoDB、MyISA表，5.7开始支持分区表。

怎么看sql是否使用了ICP：执行计划Extra： Using index condition，说明使用到了ICP。

ICP的好处：减少引擎层访问基表的次数，减少Server层访问引擎层的次数。目的减少了内部IO的数量。

下面以sql：SELECT * FROM tx WHERE key_col='x' and no_key_col1>'x' and no_key_col2

常识：数据在引擎层，要传递给Server层。

没有ICP,优化器的算法是这样：

a)引擎层通过索引(这里是是key_col)回表方式读取包含所有列的一行。

b)引擎层读取到的行传给Server层，Server层通过where后面所有条件过滤，抛弃不满足条件的行。

c)重复上述a、b，直到所有行都过滤完毕。

使用ICP后,优化器的算法是这样：

a)引擎层读取下一行的索引项(这里是是key_col)，check在where中的使用到索引的列，再通过索引过滤，如果此索引项的值满足索引列的where条件，则使用该索引项回表方式读取整行数据。

b)引擎层读取到的行传给Server层，Server层再通过where后面剩下条件(这里是no_key_col1>'x' and no_key_col2

c)重复上述a、b，直到所有行都过滤完毕。

下面是使用了ICP的sql例子：

mysql> DESC SELECT a.phone,a.loan_amount,a.duration FROM `xxx_order` as a where finish_repay_time>='2017-05-01 22:41:40.81816' and finish_repay_time

type: range

possible_keys: xxx_order_finish_repay_time_896d8b11358405_uniqkey: xxx_order_finish_repay_time_896d8b11358405_uniq

key_len:9ref:NULLrows:1Extra: Usingindex condition

打开ICP：SET optimizer_switch = 'index_condition_pushdown=on';

总结：

1.BKA、MRR的关系如下图：

BKA2.png?version=1&modificationDate=1498196475750&api=v2

BKA下层是通过调用MRR，BKA利用索引批量循环，调用MRR目的是批量回表数据由随机IO转成顺序IO。

参考资料：

声明：本文内容由网友自发贡献，不代表【wpsshop博客】立场，版权归原作者所有，本站不承担相应法律责任。如您发现有侵权的内容，请联系我们。转载请注明出处：https://www.wpsshop.cn/w/小小林熬夜学编程/article/detail/230476