SQL大致如下

select  *  占用* 代替,实际很多输出字段
   from ast_hazard_record_tf hazard
        where 1=1   and hazard.is_deleted != '0'
    and hazard.state != '02'     AND (crew_path_id LIKE '%'||3801565372||'%' or revicer_id LIKE '%'||043090||'%');

执行计划:

Gather  (cost=1000.00..14114.31 rows=1 width=4095) (actual time=0.379..84.222 rows=43 loops=1)
  Workers Planned: 3
  Workers Launched: 0
  Buffers: shared hit=12352
  ->  Parallel Seq Scan on ast_hazard_record_tf hazard  (cost=0.00..13114.21 rows=1 width=4095) (actual time=0.011..83.756 rows=43 loops=1)
        Filter: (((is_deleted)::text <> '0'::text) AND ((state)::text <> '02'::text) AND (((crew_path_id)::text ~~ '%3801565372%'::text) OR ((revicer_id)::text ~~ '%43090%'::text)))
        Rows Removed by Filter: 118100
        Buffers: shared hit=12352
Planning Time: 0.276 ms
Execution Time: 84.263 ms

独立的 crew_path_id LIKE ‘%’||3801565372||'%'查询

Seq Scan on ast_hazard_record_tf hazard  (cost=0.00..13828.79 rows=1 width=4095) (actual time=0.011..76.313 rows=1718 loops=1)
  Filter: ((crew_path_id)::text ~~ '%3801565372%'::text)
  Rows Removed by Filter: 116425
  Buffers: shared hit=12352
Planning Time: 0.156 ms
Execution Time: 76.414 ms

独立的 revicer_id LIKE ‘%’||043090||'%'查询

Seq Scan on ast_hazard_record_tf hazard  (cost=0.00..13828.79 rows=1 width=4095) (actual time=0.014..78.351 rows=144 loops=1)
  Filter: ((revicer_id)::text ~~ '%43090%'::text)
  Rows Removed by Filter: 117999
  Buffers: shared hit=12352
Planning Time: 0.188 ms
Execution Time: 78.389 ms

分析该SQl性能瓶颈在两处, like %%, 和or 条件。 和业务取得联系,两个or 条件随机出现。可能只有1个,可能2个都有。至于业务层面为啥设计成模糊检索,先不谈论。只针对该场景下,给出最佳优化方案。

解决方案

调研出SQL的性能瓶颈是该表的全表扫描。 而条件crew_path_id LIKE ‘%3801565372%’ 和 vicer_id LIKE '%043090%'都能过滤掉大量数据, 也从业务中了解到该模块的查询结果数据量并不大。令人头疼的是模糊检索以及随机的 or 组合。

通过业务基本情况,需要考虑到“全文检索”和随机的or组合, mogdb 中有gin索引, 正适合当前场景, gin索引。

GIN(Generalized Inverted Index, 通用倒排索引) 是一个存储对(key, posting list)集合的索引结构适用于数组、全文检索等数据类型,可以扩展支持多种的数据类型。
在没有引导列的情况下也能很快的完成检索。

Create extension  if not exists  pg_trgm;
create extension  if not EXISTS  btree_gin;
CREATE INDEX idx_ast_hazard_record_crew_path_revicer_id  ON power_tf.ast_hazard_record_tf  using gin( crew_path_id  gin_trgm_ops, 
revicer_id   gin_trgm_ops  );

SQL效率

EXPLAIN ( ANALYSE true, buffers true  )
select  *   from ast_hazard_record_tf hazard
        where 1=1 and 	hazard.is_deleted != '0'
    and hazard.state != '02'
        AND (crew_path_id LIKE '%'||3801565372||'%' or revicer_id LIKE '%'||043090||'%');
Bitmap Heap Scan on ast_hazard_record_tf hazard  (cost=96.00..100.02 rows=1 width=4095) (actual time=0.636..3.817 rows=43 loops=1)
  Recheck Cond: (((crew_path_id)::text ~~ '%3801565372%'::text) OR ((revicer_id)::text ~~ '%43090%'::text))
  Filter: (((is_deleted)::text <> '0'::text) AND ((state)::text <> '02'::text))
  Rows Removed by Filter: 1765
  Heap Blocks: exact=482
  Buffers: shared hit=506
  ->  BitmapOr  (cost=96.00..96.00 rows=1 width=0) (actual time=0.574..0.575 rows=0 loops=1)
        Buffers: shared hit=24
        ->  Bitmap Index Scan on idx_ast_hazard_record_crew_path_revicer_id  (cost=0.00..68.00 rows=1 width=0) (actual time=0.515..0.515 rows=1718 loops=1)
              Index Cond: ((crew_path_id)::text ~~ '%3801565372%'::text)
              Buffers: shared hit=17
        ->  Bitmap Index Scan on idx_ast_hazard_record_crew_path_revicer_id  (cost=0.00..28.00 rows=1 width=0) (actual time=0.058..0.058 rows=144 loops=1)
              Index Cond: ((revicer_id)::text ~~ '%43090%'::text)
              Buffers: shared hit=7
Planning Time: 0.241 ms
Execution Time: 3.866 ms
  • 独立的like%%查询
EXPLAIN ( ANALYSE true, buffers true  )
 select  *    from ast_hazard_record_tf hazard
   where  crew_path_id LIKE '%'||3801565372||'%';
Bitmap Heap Scan on ast_hazard_record_tf hazard  (cost=68.00..72.01 rows=1 width=4095) (actual time=0.581..3.414 rows=1718 loops=1)
  Recheck Cond: ((crew_path_id)::text ~~ '%3801565372%'::text)
  Heap Blocks: exact=429
  Buffers: shared hit=446
  ->  Bitmap Index Scan on idx_ast_hazard_record_crew_path_revicer_id  (cost=0.00..68.00 rows=1 width=0) (actual time=0.529..0.529 rows=1718 loops=1)
        Index Cond: ((crew_path_id)::text ~~ '%3801565372%'::text)
        Buffers: shared hit=17
Planning Time: 0.197 ms
Execution Time: 3.529 ms
  • 独立的like%%查询
EXPLAIN ( ANALYSE true, buffers true  )
select  *  
   from ast_hazard_record_tf hazard
        where  
          revicer_id LIKE ('%'||043090||'%');
Bitmap Heap Scan on ast_hazard_record_tf hazard  (cost=28.00..32.01 rows=1 width=4095) (actual time=0.096..0.557 rows=144 loops=1)
  Recheck Cond: ((revicer_id)::text ~~ '%43090%'::text)
  Heap Blocks: exact=105
  Buffers: shared hit=112
  ->  Bitmap Index Scan on idx_ast_hazard_record_crew_path_revicer_id  (cost=0.00..28.00 rows=1 width=0) (actual time=0.068..0.069 rows=144 loops=1)
        Index Cond: ((revicer_id)::text ~~ '%43090%'::text)
        Buffers: shared hit=7
Planning Time: 0.297 ms
Execution Time: 0.657 ms

总结

(crew_path_id LIKE '%'||3801565372||'%' or revicer_id LIKE '%'||043090||'%');

随机出现的场景

条件

扫描方式

扫描成本

耗时

两个like%%

全表扫描

12352逻辑读

80ms

crew_path_id LIKE%%

全表扫描

12352逻辑读

80ms

revicer_id LIKE%%

全表扫描

12352逻辑读

80ms

建立

gin(crew_path_id  gin_trgm_ops, revicer_id gin_trgm_ops)

索引后
随机出现的场景

条件

扫描方式

扫描成本

耗时

两个like%%

2次Bitmap Index Scan

506逻辑读

3.8ms

crew_path_id LIKE%%

1次Bitmap Index Scan

446逻辑读

3.5ms

revicer_id LIKE%%

1次 Bitmap Index Scan

112逻辑读

0.6ms

阿里云国内75折 回扣 微信号:monov8
阿里云国际,腾讯云国际,低至75折。AWS 93折 免费开户实名账号 代冲值 优惠多多 微信号:monov8 飞机:@monov6