postgresql sql优化 gin多列索引 对 or 条件的优化
阿里云国内75折 回扣 微信号:monov8 |
阿里云国际,腾讯云国际,低至75折。AWS 93折 免费开户实名账号 代冲值 优惠多多 微信号:monov8 飞机:@monov6 |
SQL大致如下
select * 占用* 代替,实际很多输出字段
from ast_hazard_record_tf hazard
where 1=1 and hazard.is_deleted != '0'
and hazard.state != '02' AND (crew_path_id LIKE '%'||3801565372||'%' or revicer_id LIKE '%'||043090||'%');
执行计划:
Gather (cost=1000.00..14114.31 rows=1 width=4095) (actual time=0.379..84.222 rows=43 loops=1)
Workers Planned: 3
Workers Launched: 0
Buffers: shared hit=12352
-> Parallel Seq Scan on ast_hazard_record_tf hazard (cost=0.00..13114.21 rows=1 width=4095) (actual time=0.011..83.756 rows=43 loops=1)
Filter: (((is_deleted)::text <> '0'::text) AND ((state)::text <> '02'::text) AND (((crew_path_id)::text ~~ '%3801565372%'::text) OR ((revicer_id)::text ~~ '%43090%'::text)))
Rows Removed by Filter: 118100
Buffers: shared hit=12352
Planning Time: 0.276 ms
Execution Time: 84.263 ms
独立的 crew_path_id LIKE ‘%’||3801565372||'%'查询
Seq Scan on ast_hazard_record_tf hazard (cost=0.00..13828.79 rows=1 width=4095) (actual time=0.011..76.313 rows=1718 loops=1)
Filter: ((crew_path_id)::text ~~ '%3801565372%'::text)
Rows Removed by Filter: 116425
Buffers: shared hit=12352
Planning Time: 0.156 ms
Execution Time: 76.414 ms
独立的 revicer_id LIKE ‘%’||043090||'%'查询
Seq Scan on ast_hazard_record_tf hazard (cost=0.00..13828.79 rows=1 width=4095) (actual time=0.014..78.351 rows=144 loops=1)
Filter: ((revicer_id)::text ~~ '%43090%'::text)
Rows Removed by Filter: 117999
Buffers: shared hit=12352
Planning Time: 0.188 ms
Execution Time: 78.389 ms
分析该SQl性能瓶颈在两处, like %%, 和or 条件。 和业务取得联系,两个or 条件随机出现。可能只有1个,可能2个都有。至于业务层面为啥设计成模糊检索,先不谈论。只针对该场景下,给出最佳优化方案。
解决方案
调研出SQL的性能瓶颈是该表的全表扫描。 而条件crew_path_id LIKE ‘%3801565372%’ 和 vicer_id LIKE '%043090%'都能过滤掉大量数据, 也从业务中了解到该模块的查询结果数据量并不大。令人头疼的是模糊检索以及随机的 or 组合。
通过业务基本情况,需要考虑到“全文检索”和随机的or组合, mogdb 中有gin索引, 正适合当前场景, gin索引。
GIN(Generalized Inverted Index, 通用倒排索引) 是一个存储对(key, posting list)集合的索引结构适用于数组、全文检索等数据类型,可以扩展支持多种的数据类型。
在没有引导列的情况下也能很快的完成检索。
Create extension if not exists pg_trgm;
create extension if not EXISTS btree_gin;
CREATE INDEX idx_ast_hazard_record_crew_path_revicer_id ON power_tf.ast_hazard_record_tf using gin( crew_path_id gin_trgm_ops,
revicer_id gin_trgm_ops );
SQL效率
EXPLAIN ( ANALYSE true, buffers true )
select * from ast_hazard_record_tf hazard
where 1=1 and hazard.is_deleted != '0'
and hazard.state != '02'
AND (crew_path_id LIKE '%'||3801565372||'%' or revicer_id LIKE '%'||043090||'%');
Bitmap Heap Scan on ast_hazard_record_tf hazard (cost=96.00..100.02 rows=1 width=4095) (actual time=0.636..3.817 rows=43 loops=1)
Recheck Cond: (((crew_path_id)::text ~~ '%3801565372%'::text) OR ((revicer_id)::text ~~ '%43090%'::text))
Filter: (((is_deleted)::text <> '0'::text) AND ((state)::text <> '02'::text))
Rows Removed by Filter: 1765
Heap Blocks: exact=482
Buffers: shared hit=506
-> BitmapOr (cost=96.00..96.00 rows=1 width=0) (actual time=0.574..0.575 rows=0 loops=1)
Buffers: shared hit=24
-> Bitmap Index Scan on idx_ast_hazard_record_crew_path_revicer_id (cost=0.00..68.00 rows=1 width=0) (actual time=0.515..0.515 rows=1718 loops=1)
Index Cond: ((crew_path_id)::text ~~ '%3801565372%'::text)
Buffers: shared hit=17
-> Bitmap Index Scan on idx_ast_hazard_record_crew_path_revicer_id (cost=0.00..28.00 rows=1 width=0) (actual time=0.058..0.058 rows=144 loops=1)
Index Cond: ((revicer_id)::text ~~ '%43090%'::text)
Buffers: shared hit=7
Planning Time: 0.241 ms
Execution Time: 3.866 ms
- 独立的like%%查询
EXPLAIN ( ANALYSE true, buffers true )
select * from ast_hazard_record_tf hazard
where crew_path_id LIKE '%'||3801565372||'%';
Bitmap Heap Scan on ast_hazard_record_tf hazard (cost=68.00..72.01 rows=1 width=4095) (actual time=0.581..3.414 rows=1718 loops=1)
Recheck Cond: ((crew_path_id)::text ~~ '%3801565372%'::text)
Heap Blocks: exact=429
Buffers: shared hit=446
-> Bitmap Index Scan on idx_ast_hazard_record_crew_path_revicer_id (cost=0.00..68.00 rows=1 width=0) (actual time=0.529..0.529 rows=1718 loops=1)
Index Cond: ((crew_path_id)::text ~~ '%3801565372%'::text)
Buffers: shared hit=17
Planning Time: 0.197 ms
Execution Time: 3.529 ms
- 独立的like%%查询
EXPLAIN ( ANALYSE true, buffers true )
select *
from ast_hazard_record_tf hazard
where
revicer_id LIKE ('%'||043090||'%');
Bitmap Heap Scan on ast_hazard_record_tf hazard (cost=28.00..32.01 rows=1 width=4095) (actual time=0.096..0.557 rows=144 loops=1)
Recheck Cond: ((revicer_id)::text ~~ '%43090%'::text)
Heap Blocks: exact=105
Buffers: shared hit=112
-> Bitmap Index Scan on idx_ast_hazard_record_crew_path_revicer_id (cost=0.00..28.00 rows=1 width=0) (actual time=0.068..0.069 rows=144 loops=1)
Index Cond: ((revicer_id)::text ~~ '%43090%'::text)
Buffers: shared hit=7
Planning Time: 0.297 ms
Execution Time: 0.657 ms
总结
(crew_path_id LIKE '%'||3801565372||'%' or revicer_id LIKE '%'||043090||'%');
随机出现的场景
条件 | 扫描方式 | 扫描成本 | 耗时 |
两个like%% | 全表扫描 | 12352逻辑读 | 80ms |
crew_path_id LIKE%% | 全表扫描 | 12352逻辑读 | 80ms |
revicer_id LIKE%% | 全表扫描 | 12352逻辑读 | 80ms |
建立
gin(crew_path_id gin_trgm_ops, revicer_id gin_trgm_ops)
索引后
随机出现的场景
条件 | 扫描方式 | 扫描成本 | 耗时 |
两个like%% | 2次Bitmap Index Scan | 506逻辑读 | 3.8ms |
crew_path_id LIKE%% | 1次Bitmap Index Scan | 446逻辑读 | 3.5ms |
revicer_id LIKE%% | 1次 Bitmap Index Scan | 112逻辑读 | 0.6ms |
阿里云国内75折 回扣 微信号:monov8 |
阿里云国际,腾讯云国际,低至75折。AWS 93折 免费开户实名账号 代冲值 优惠多多 微信号:monov8 飞机:@monov6 |