当前位置:   article > 正文

Postgresql中order by limit 索引选择错误导致查询很慢_pgsql order by 耗时极长

pgsql order by 耗时极长

开发今天发来一条SQL,说执行很慢,出不来结果,如下:

select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 10;
  • 1

表大小,记录数,索引信息如下:

melotpay=# select count(id) from hist_kmoney_record ;
   count   
-----------
 133300926

melotpay=# \dt+ hist_kmoney_record
                                 List of relations
 Schema |        Name        | Type  |  Owner   | Persistence | Size  | Description 
--------+--------------------+-------+----------+-------------+-------+-------------
 public | hist_kmoney_record | table | postgres | permanent   | 22 GB | 

 melotpay=# \d hist_kmoney_record
                      Table "public.hist_kmoney_record"
    Column     |            Type             | Collation | Nullable | Default 
---------------+-----------------------------+-----------+----------+---------
 id            | integer                     |           | not null | 
 user_id       | integer                     |           |          | 
 k_money_value | bigint                      |           |          | 
 channel       | integer                     |           |          | 
 status        | integer                     |           |          | 
 reason        | text                        |           |          | 
 count         | integer                     |           |          | 
 add_time      | timestamp without time zone |           |          | 
 update_time   | timestamp without time zone |           |          | 
 note          | text                        |           |          | 
 uuid          | text                        |           |          | 
Indexes:
    "hist_kmoney_record_pkey" PRIMARY KEY, btree (id)
    "index_hkr_userid_channel" btree (user_id, channel)
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29

看一下执行计划

melotpay=# explain select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 10;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..4747.37 rows=10 width=81)
   ->  Index Scan Backward using hist_kmoney_record_pkey on hist_kmoney_record  (cost=0.57..7030008.05 rows=14810 width=81)
         Filter: (user_id = 133423946)
(3 rows)
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7

执行计划显示走的hist_kmoney_record_pkey索引,也就是id主键的索引,而没有用到user_id的索引,所以过滤user_id的代价很大,导致SQL很慢。
从表信息看到,表还是比较大的,我们知道pg是按代价来评估执行计划的,这里为什么没有用user_id列的索引呢,这里是因为统计信息里没有记录user_id和id的相关性。

查看user_id的线性相关性,如下可见是很离散的,因为该值越接近1,线性越好,就是有可能按主键索引排序后需要的数据可能在很后面,要出结果,对于这样的表那将会很慢。所以这里评估代价生成的执行计划是有问题的。

melotpay=# select correlation  from  pg_stats where tablename ='hist_kmoney_record' and attname='user_id';
 correlation 
-------------
  0.13479742
  • 1
  • 2
  • 3
  • 4

优化(不用id字段的主键索引):

  1. 加大limit值,会使得走id字段索引的评估代价超过走user_id的代价,那么优化器就会选择走user_id字段的索引,如下
melotpay=# explain  select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 1000;
                                               QUERY PLAN                                                
---------------------------------------------------------------------------------------------------------
 Limit  (cost=57185.60..57188.10 rows=1000 width=81)
   ->  Sort  (cost=57185.60..57222.62 rows=14810 width=81)
         Sort Key: id DESC
         ->  Bitmap Heap Scan on hist_kmoney_record  (cost=279.34..56373.58 rows=14810 width=81)
               Recheck Cond: (user_id = 133423946)
               ->  Bitmap Index Scan on index_hkr_userid_channel  (cost=0.00..275.64 rows=14810 width=0)
                     Index Cond: (user_id = 133423946)
(7 rows)

  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  1. limit 1000不是我们想要的分页数,那么可以选择在id字段加一个常量,从而达到不走id字段索引的目的
melotpay=# explain analyze select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id+0 desc limit 10;
                                                                    QUERY PLAN                                                                     
---------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=56023.31..56023.33 rows=10 width=85) (actual time=0.020..0.021 rows=1 loops=1)
   ->  Sort  (cost=56023.31..56059.85 rows=14619 width=85) (actual time=0.019..0.020 rows=1 loops=1)
         Sort Key: ((id + 0)) DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on hist_kmoney_record  (cost=277.86..55707.40 rows=14619 width=85) (actual time=0.014..0.014 rows=1 loops=1)
               Recheck Cond: (user_id = 133423946)
               Heap Blocks: exact=1
               ->  Bitmap Index Scan on index_hkr_userid_channel  (cost=0.00..274.21 rows=14619 width=0) (actual time=0.009..0.009 rows=1 loops=1)
                     Index Cond: (user_id = 133423946)
 Planning Time: 0.136 ms
 Execution Time: 0.049 ms
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  1. 创建合适的索引,如(user_id,id)组合索引
melotpay=# create index idx_hist_kmoney_record_union on hist_kmoney_record(user_id,id);
CREATE INDEX
melotpay=#  explain analyze select id, user_id, k_money_value, channel, status, add_time, note from hist_kmoney_record where user_id=133423946  order by id desc limit 10;
                                                                               QUERY PLAN                                                                     
           
--------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------
 Limit  (cost=0.57..40.35 rows=10 width=81) (actual time=1.302..1.304 rows=1 loops=1)
   ->  Index Scan Backward using idx_hist_kmoney_record_union on hist_kmoney_record  (cost=0.57..58907.13 rows=14809 width=81) (actual time=1.298..1.300 rows=
1 loops=1)
         Index Cond: (user_id = 133423946)
 Planning Time: 0.090 ms
 Execution Time: 1.330 ms
(5 rows)
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14

注意:以上方法并不具有普适性,需要考量实际情况,调整对应的方法。

声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/Gausst松鼠会/article/detail/135203
推荐阅读
相关标签
  

闽ICP备14008679号