GaussDB DWS执行计划浅析

风晓 2024-01-13 08:35:35  49997 赞同 0 反对 0

分类：资源标签：国产数据库

GaussDB DWS执行计划浅析

关键字

dws的执行计划比pg的要多一些参数，explain除了explain analyze外还有explain performance，三者的差异如下：
EXPLAIN statement：只生成执行计划，不实际执行。其中statement代表SQL语句。
EXPLAIN ANALYZE statement：生成执行计划，进行执行，并显示执行的概要信息。显示中加入了实际的运行时间统计，包括在每个规划节点内部花掉的总时间(以毫秒计)和它实际返回的行数。
EXPLAIN PERFORMANCE statement：生成执行计划，进行执行，并显示执行期间的全部信息。
为了测量运行时在执行计划中每个节点的开销，EXPLAIN ANALYZE或EXPLAIN PERFORMANCE会在当前查询执行上增加性能分析的开销。在一个查询上运行EXPLAIN ANALYZE或EXPLAIN PERFORMANCE有时会比普通查询明显的花费更多的时间。超支的数量依赖于查询的本质和使用的平台。
因此，当定位SQL运行慢问题时，如果SQL长时间运行未结束，建议通过EXPLAIN命令查看执行计划，进行初步定位。如果SQL可以运行出来，则推荐使用EXPLAIN ANALYZE或EXPLAIN PERFORMANCE查看执行计划及其实际的运行信息，以便更精准地定位问题原因。
dws执行计划提供了四种显示格式，分别是：normal，pretty，summary，run。通过设置GUC参数explain_perf_mode，可以显示不同格式的执行计划，其中normal的格式和pg的执行计划接近。pg中也有四种格式，分别是：text，xml，json，yaml。另外就是dws是一个分布式数据库所以执行计划和pg的是有一些区别的，比如协调节点的引入

表访问方式

dws中因为有列存所以，表访问方式中除了常见的seq scan外，还有列存表的访问CStore Scan，列存的索引扫描也一样CStore index Scan等

算子类型	描述
SeqScan	顺序扫描
Cstore Scan	列存表扫描
IndexScan	索引扫描
SubqueryScan	子查询
ValuesScan	values链表
FunctionScan	函数结果集

验证

根据id查询的执行计划，并没有显示走索引，而是显示的‘REMOTE_FQS_QUERY__’，执行计划中包含__REMOTE_FQS_QUERY，表明语句走了FQS（Fast Query Shipping），代表这种场景下，直接下发语句到Datanode执行，而不是在Coordinator上生成执行计划，这种表示该语句无需Datanode间数据交互，例如简单的等值过滤查询，或者是关联列是分布列的关联查询

--1、设置执行计划显示格式为normal
--t1表有100w数据，id有索引
db_zxzh=# set explain_perf_mode=normal;
SET
Time: 2.243 ms

db_zxzh=#   explain analyze select * from t1 where  t1.id = '6513d5e7-52c3-0500-a741-4701e490fffe';
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Data Node Scan on "__REMOTE_FQS_QUERY__"  (cost=0.00..0.00 rows=0 width=0) (actual time=1.038..1.276 rows=1 loops=1)
   Node/s: All datanodes
 Total runtime: 1.458 ms
(3 rows)

Time: 3.831 ms

--2、设置为pretty格式
db_zxzh=# set explain_perf_mode=pretty;
SET
Time: 1.377 ms
db_zxzh=# explain analyze select * from t1 where  t1.id = '6513d5e7-52c3-0500-a741-4701e490fffe';
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
  id |                  operation                   | A-time | A-rows | E-rows | Peak Memory | A-width | E-width | E-costs 
 ----+----------------------------------------------+--------+--------+--------+-------------+---------+---------+---------
   1 | ->  Data Node Scan on "__REMOTE_FQS_QUERY__" | 5.204  |      1 |      0 | 56KB        |         |       0 | 0.00    

        ====== Query Summary =====        
 -----------------------------------------
 Coordinator executor start time: 0.090 ms
 Coordinator executor run time: 5.263 ms
 Coordinator executor end time: 0.019 ms
 Planner runtime: 0.264 ms
 Query Id: 73746443899626020
 Total runtime: 5.453 ms
(12 rows)

Time: 11.411 ms

--3、summary
db_zxzh=# set explain_perf_mode=summary;
SET
Time: 2.547 ms
db_zxzh=# explain analyze select * from t1 where  t1.id = '6513d5e7-52c3-0500-a741-4701e490fffe';
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
  id |                  operation                   | A-time | A-rows | E-rows | Peak Memory | A-width | E-width | E-costs 
 ----+----------------------------------------------+--------+--------+--------+-------------+---------+---------+---------
   1 | ->  Data Node Scan on "__REMOTE_FQS_QUERY__" | 4.371  |      1 |      0 | 56KB        |         |       0 | 0.00    

        ====== Query Summary =====        
 -----------------------------------------
 Coordinator executor start time: 0.033 ms
 Coordinator executor run time: 4.398 ms
 Coordinator executor end time: 0.012 ms
 Planner runtime: 0.108 ms
 Query Id: 73746443899626548
 Total runtime: 4.498 ms
(12 rows)

Time: 6.321 ms

--4、run
db_zxzh=# set explain_perf_mode=run;
SET
Time: 1.411 ms
db_zxzh=# explain analyze select * from t1 where  t1.id = '6513d5e7-52c3-0500-a741-4701e490fffe';
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
  id |                  operation                   | A-time | A-rows | E-rows | Peak Memory | A-width | E-width | E-costs 
 ----+----------------------------------------------+--------+--------+--------+-------------+---------+---------+---------
   1 | ->  Data Node Scan on "__REMOTE_FQS_QUERY__" | 1.690  |      1 |      0 | 56KB        |         |       0 | 0.00    

        ====== Query Summary =====        
 -----------------------------------------
 Coordinator executor start time: 0.048 ms
 Coordinator executor run time: 1.738 ms
 Coordinator executor end time: 0.019 ms
 Planner runtime: 0.133 ms
 Query Id: 73746443899626909
 Total runtime: 1.885 ms
(12 rows)

Time: 4.704 ms

对比pg和gp的执行计划

--对比postgresql的执行计划：
abase=# explain analyze select * from t1 where id = '2344';
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Index Scan using i_t1_id on t1  (cost=0.42..2.44 rows=1 width=51) (actual time=0.426..0.430 rows=0 loops=1)
   Index Cond: ((id)::text = '2344'::text)
 Planning Time: 4.160 ms
 Execution Time: 0.644 ms
(4 rows)

--greenplum：
db_zxzh=# explain analyze select * from t1 where id = '2344';
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Gather Motion 4:1  (slice1; segments: 4)  (cost=0.00..6.00 rows=1 width=608) (actual time=261.975..261.975 rows=0 loops=1)
   ->  Index Scan using i_t1_id on t1  (cost=0.00..6.00 rows=1 width=608) (never executed)
         Index Cond: ((id)::text = '2344'::text)
 Planning time: 89.090 ms
   (slice0)    Executor memory: 60K bytes.
   (slice1)    Executor memory: 60K bytes avg x 4 workers, 60K bytes max (seg0).
 Memory used:  128000kB
 Optimizer: Pivotal Optimizer (GPORCA)
 Execution time: 418.450 ms
(9 rows)

DWS中常见类型计划

GaussDB(DWS)中当前主要存在三类分布式计划：

FQS(fast query shipping)计划
CN直接将原语句下发到DN，各DN单独执行，并将执行结果在CN上进行汇总。
Stream计划
CN根据原语句生成计划并将计划下发给DN进行执行，各DN执行过程中使用Stream算子进行数据交互。
Remote-Query计划
CN生成计划后，将部分原语句下发到DN，各DN单独执行，执行后将结果发送给CN，CN执行剩余计划。

--带有函数情况下不能下推，只能在CN上面计算的情况。--验证两表连接
db_zxzh=# \d+ t1
                                  Table "db_uim.t1"
 Column |          Type          | Modifiers | Storage  | Stats target | Description 
--------+------------------------+-----------+----------+--------------+-------------
 id     | character varying(300) |           | extended |              | 
 c_name | character varying(300) |           | extended |              | 
 c_info | character varying(300) |           | extended |              | 
Indexes:
    "i_t1_id" btree (id) TABLESPACE pg_default
Has OIDs: no
Distribute By: HASH(id)
Location Nodes: ALL DATANODES
Options: orientation=row, compression=no

create table t1(id varchar(300),c_name varchar(300),c_info varchar(300)) distribute by hash(id)

--t1和tt1完全一样，当两个表都是用分布键关联的时候执行计划如下：
db_zxzh=# explain analyze select * from t1,tt1 where t1.id=tt1.id;
                                                        QUERY PLAN                                                        
--------------------------------------------------------------------------------------------------------------------------
 Data Node Scan on "__REMOTE_FQS_QUERY__"  (cost=0.00..0.00 rows=0 width=0) (actual time=329.133..329.133 rows=0 loops=1)
   Node/s: All datanodes
 Total runtime: 329.356 ms
(3 rows)

Time: 331.640 ms


--类型2：当tt1使用非分布键的时候，非fqs计划，部分语句下推，
db_zxzh=# explain select * from t1,tt1 where t1.id=tt1.c_name;
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=25501.98..65787.57 rows=1000000 width=102)
   Node/s: All datanodes
   ->  Hash Join  (cost=25497.98..65370.94 rows=1000000 width=102)
         Hash Cond: ((tt1.c_name)::text = (t1.id)::text)
         Skew Join Optimized by Statistic
         ->  Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)  (cost=0.00..35289.64 rows=1000000 width=51)
               Skew Filter(type: ROUNDROBIN): ((c_name)::text = ('张三'::character varying(300))::text COLLATE "default")
               Spawn on: All datanodes
               ->  Seq Scan on tt1  (cost=0.00..6956.33 rows=1000000 width=51)
         ->  Hash  (cost=21331.32..21331.32 rows=999999 width=51)
               ->  Streaming(type: PART LOCAL PART BROADCAST)  (cost=0.00..21331.32 rows=1000000 width=51)
                     Skew Filter(type: BROADCAST): ((id)::text = ('张三'::character varying(300))::text COLLATE "default")
                     Spawn on: All datanodes
                     ->  Seq Scan on t1  (cost=0.00..6956.33 rows=1000000 width=51)
(14 rows)

Time: 7.550 ms

--如果我们新建一张tt2，不加分布键，而是使用默认的分布键是怎么样的呢？这一步是operator是on。和官方文档的类型四一样
--2表join，且连接条件包含非分布列，在开启stream算子（set enable_stream_operator=on）的情况下，会生成stream计划，其dn间存在数据交换，此时对于表tt2，会在各DN进行基表扫描，
--扫描后会通过redistribute stream算子按照join条件中的tt2.id进行hash计算后重新发送给各dn，然后在各dn上做join，最后汇总到CN

create table tt2 as select * from t1;
db_zxzh=# explain select * from t1,tt2 where t1.id=tt2.id;
                                       QUERY PLAN                                       
----------------------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=11126.99..50579.25 rows=1000000 width=102)
   Node/s: All datanodes
   ->  Hash Join  (cost=11122.99..50162.62 rows=1000000 width=102)
         Hash Cond: ((tt2.id)::text = (t1.id)::text)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..34456.30 rows=1000000 width=51)
               Spawn on: All datanodes
               ->  Seq Scan on tt2  (cost=0.00..6956.33 rows=1000000 width=51)
         ->  Hash  (cost=6956.33..6956.33 rows=999999 width=51)
               ->  Seq Scan on t1  (cost=0.00..6956.33 rows=1000000 width=51)
(9 rows)

--如果我们关闭operator：就和官方文档一样的结果：
db_zxzh=# set enable_stream_operator=off;
SET
db_zxzh=# explain select * from t1,tt2 where t1.id=tt2.id;
                                            QUERY PLAN                                             
---------------------------------------------------------------------------------------------------
 Hash Join  (cost=12500.00..26250.00 rows=1000000 width=102)
   Hash Cond: ((t1.id)::text = (tt2.id)::text)
   ->  Data Node Scan on t1 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000000 width=51)
         Node/s: All datanodes
   ->  Hash  (cost=0.00..0.00 rows=1000000 width=51)
         ->  Data Node Scan on tt2 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000000 width=51)
               Node/s: All datanodes
(7 rows)


--类型三：stream计划，dn之间无数据交换，
--两表JOIN，且连接条件为各表的分布列，因此各DN无需数据交换。CN生成stream计划后，将除Gather Stream的计划下发给DN执行，在各个DN上进行基表扫描，并进行哈希连接后，发送给CN。

db_zxzh=# set enable_fast_query_shipping=off;
SET
Time: 1.568 ms
db_zxzh=# explain analyze select * from t1,tt1 where t1.id=tt1.id;
                                                                QUERY PLAN                                                              

----------------------------------------------------------------------------------------------------------------------------------------
--
 Streaming (type: GATHER)  (cost=11126.99..23079.28 rows=1000000 width=102) (actual time=424.650..424.650 rows=0 loops=1)
   Node/s: All datanodes
   ->  Hash Join  (cost=11122.99..22662.65 rows=1000000 width=102) (actual time=[357.606,357.606]..[421.902,421.902], rows=0)
         Hash Cond: ((tt1.id)::text = (t1.id)::text)
         ->  Seq Scan on tt1  (cost=0.00..6956.33 rows=1000000 width=51) (actual time=[0.017,60.788]..[0.024,72.400], rows=1000000)
         ->  Hash  (cost=6956.33..6956.33 rows=999999 width=51) (actual time=[175.209,175.209]..[223.874,223.874], rows=1000000)
               Max Buckets: 524288  Max Batches: 1  Max Memory Usage: 27024kB
               Min Buckets: 524288  Min Batches: 1  Min Memory Usage: 27015kB
               ->  Seq Scan on t1  (cost=0.00..6956.33 rows=1000000 width=51) (actual time=[0.011,68.214]..[0.027,103.418], rows=1000000
)
 Total runtime: 425.044 ms
(10 rows)

Time: 430.293 ms

--带有函数情况下不能下推，只能在CN上面计算的情况。

DWS诊断信息

将explain_perf_mode设置为pretty并且加上varbose的结果：很详细并且打印了sql诊断信息：reason: Function unship_func() can not be shipped

db_zxzh=# explain verbose select unship_func(tt2.id,tt2.c_name) from tt2 join t1 on t1.id = tt2.id;
                                                QUERY PLAN                                                
----------------------------------------------------------------------------------------------------------
  id |                       operation                       | E-rows  | E-distinct | E-width | E-costs  
 ----+-------------------------------------------------------+---------+------------+---------+----------
   1 | ->  Hash Join (2,3)                                   | 1000000 |            |      44 | 33750.00 
   2 |    ->  Data Node Scan on tt2 "_REMOTE_TABLE_QUERY_"   | 1000000 |            |      44 | 0.00     
   3 |    ->  Hash                                           | 1000000 |            |      37 | 0.00     
   4 |       ->  Data Node Scan on t1 "_REMOTE_TABLE_QUERY_" | 1000000 |            |      37 | 0.00     

             SQL Diagnostic Information            
 --------------------------------------------------
 SQL is not plan-shipping
         reason: Function unship_func() can not be shipped

    Predicate Information (identified by plan id)   
 ---------------------------------------------------
   1 --Hash Join (2,3)
         Hash Cond: ((tt2.id)::text = (t1.id)::text)

             Targetlist Information (identified by plan id)             
 -----------------------------------------------------------------------
   1 --Hash Join (2,3)
         Output: ((tt2.id)::integer + (tt2.c_name)::integer)
   2 --Data Node Scan on tt2 "_REMOTE_TABLE_QUERY_"
         Output: tt2.id, tt2.c_name
         Node/s: All datanodes
         Remote query: SELECT id, c_name FROM ONLY db_uim.tt2 WHERE true
   3 --Hash
         Output: t1.id
   4 --Data Node Scan on t1 "_REMOTE_TABLE_QUERY_"
         Output: t1.id
         Node/s: All datanodes
         Remote query: SELECT id FROM ONLY db_uim.t1 WHERE true
(31 rows)

--不加verbose
db_zxzh=# explain  select unship_func(tt2.id,tt2.c_name) from tt2 join t1 on t1.id = tt2.id;
                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
  id |                       operation                       | E-rows  | E-width | E-costs  
 ----+-------------------------------------------------------+---------+---------+----------
   1 | ->  Hash Join (2,3)                                   | 1000000 |      44 | 33750.00 
   2 |    ->  Data Node Scan on tt2 "_REMOTE_TABLE_QUERY_"   | 1000000 |      44 | 0.00     
   3 |    ->  Hash                                           | 1000000 |      37 | 0.00     
   4 |       ->  Data Node Scan on t1 "_REMOTE_TABLE_QUERY_" | 1000000 |      37 | 0.00     

    Predicate Information (identified by plan id)   
 ---------------------------------------------------
   1 --Hash Join (2,3)
         Hash Cond: ((tt2.id)::text = (t1.id)::text)
(11 rows)


--如果将explain_perf_mode改成normal，无论加不加verbose都不会打印诊断信息
db_zxzh=# set explain_perf_mode=normal;
SET
db_zxzh=# explain  select unship_func(tt2.id,tt2.c_name) from tt2 join t1 on t1.id = tt2.id;
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Hash Join  (cost=12500.00..33750.00 rows=1000000 width=44)
   Hash Cond: ((tt2.id)::text = (t1.id)::text)
   ->  Data Node Scan on tt2 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000000 width=44)
         Node/s: All datanodes
   ->  Hash  (cost=0.00..0.00 rows=1000000 width=37)
         ->  Data Node Scan on t1 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000000 width=37)
               Node/s: All datanodes
(7 rows)

db_zxzh=# explain verbose select unship_func(tt2.id,tt2.c_name) from tt2 join t1 on t1.id = tt2.id;
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Hash Join  (cost=12500.00..33750.00 rows=1000000 width=44)
   Output: ((tt2.id)::integer + (tt2.c_name)::integer)
   Hash Cond: ((tt2.id)::text = (t1.id)::text)
   ->  Data Node Scan on tt2 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000000 width=44)
         Output: tt2.id, tt2.c_name
         Node/s: All datanodes
         Remote query: SELECT id, c_name FROM ONLY db_uim.tt2 WHERE true
   ->  Hash  (cost=0.00..0.00 rows=1000000 width=37)
         Output: t1.id
         ->  Data Node Scan on t1 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000000 width=37)
               Output: t1.id
               Node/s: All datanodes
               Remote query: SELECT id FROM ONLY db_uim.t1 WHERE true
(13 rows)

详细信息performance

如果我们将db_zxzh=# set explain_perf_mode=summary;，
可以打印更详细的内容，并且使用explain performance ,就可以看到更详细的内容其中有A-time,和E-rows等，其中a开头代表A-rows（Actual Rows）：这是实际返回的行数，E开头代表：E-memory（Estimated Memory）：这是估计的内存使用量

db_zxzh=# explain performance select * from tt2 join t1 on t1.id= tt2.id;
                                                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
  id |                operation                |       A-time       | A-rows  | E-rows  | E-distinct |  Peak Memory   |   E-memory   | A-width | E-width | E-costs  
 ----+-----------------------------------------+--------------------+---------+---------+------------+----------------+--------------+---------+---------+----------
   1 | ->  Streaming (type: GATHER)            | 1043.218           | 1000000 | 1000000 |            | 56KB           |              |         |     102 | 50579.25 
   2 |    ->  Hash Join (3,5)                  | [701.128, 730.338] | 1000000 | 1000000 |            | [16KB, 16KB]   | 1MB          |         |     102 | 50162.62 
   3 |       ->  Streaming(type: REDISTRIBUTE) | [78.371, 139.415]  | 1000000 | 1000000 | 333333     | [128KB, 128KB] | 2MB          |         |      51 | 34456.30 
   4 |          ->  Seq Scan on db_uim.tt2     | [61.673, 75.716]   | 1000000 | 1000000 |            | [32KB, 32KB]   | 1MB          |         |      51 | 6956.33  
   5 |       ->  Hash                          | [190.192, 237.935] | 1000000 |  999999 | 333333     | [32MB, 32MB]   | 38MB(4435MB) | [67,67] |      51 | 6956.33  
   6 |          ->  Seq Scan on db_uim.t1      | [68.364, 110.741]  | 1000000 | 1000000 |            | [16KB, 16KB]   | 1MB          |         |      51 | 6956.33  

    Predicate Information (identified by plan id)   
 ---------------------------------------------------
   2 --Hash Join (3,5)
         Hash Cond: ((tt2.id)::text = (t1.id)::text)

--如果不加performance，执行计划在原有的基础上显示详细信息
db_zxzh=# set explain_perf_mode=normal;
SET
db_zxzh=# explain performance select * from tt2 join t1 on t1.id= tt2.id;
                                                                              QUERY PLAN                                                                              
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Streaming (type: GATHER) (primary node count=0, node count=3)  (cost=11126.99..50579.25 rows=1000000 width=102) (actual time=222.153..1111.734 rows=1000000 loops=1)
   Output: tt2.id, tt2.c_name, tt2.c_info, t1.id, t1.c_name, t1.c_info
   Node/s: All datanodes
   (Buffers: shared hit=1)
   (CPU: ex c/r=106, ex row=1000000, ex cyc=106589111, inc cyc=106589111)
   ->  Hash Join  (cost=11122.99..50162.62 rows=1000000 width=102)
         Output: tt2.id, tt2.c_name, tt2.c_info, t1.id, t1.c_name, t1.c_info
         Hash Cond: ((tt2.id)::text = (t1.id)::text)
          dn_6001_6002 (actual time=219.450..740.547 rows=333282 loops=1) (projection time=32.635)
          dn_6003_6004 (actual time=217.816..762.811 rows=333393 loops=1) (projection time=32.622)
          dn_6005_6006 (actual time=241.365..732.904 rows=333325 loops=1) (projection time=32.644)
          dn_6001_6002 (Buffers: shared hit=3686)
          dn_6003_6004 (Buffers: shared hit=3688)
          dn_6005_6006 (Buffers: shared hit=3687)
          dn_6001_6002 (CPU: ex c/r=62, ex row=666564, ex cyc=41363381, inc cyc=72562784)
          dn_6003_6004 (CPU: ex c/r=69, ex row=666786, ex cyc=46154058, inc cyc=74769920)
          dn_6005_6006 (CPU: ex c/r=59, ex row=666650, ex cyc=39981994, inc cyc=71790063)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..34456.30 rows=1000000 distinct=333333 width=51)
               Output: tt2.id, tt2.c_name, tt2.c_info
               Distribute Key: tt2.id
               Spawn on: All datanodes
               Consumer Nodes: All datanodes
                dn_6001_6002 (actual time=49.834..157.660 rows=333282 loops=1)
                dn_6003_6004 (actual time=0.139..84.954 rows=333393 loops=1)
                dn_6005_6006 (actual time=0.132..93.302 rows=333325 loops=1)
                dn_6001_6002 (Buffers: 0)
                dn_6003_6004 (Buffers: 0)
                dn_6005_6006 (Buffers: 0)
                dn_6001_6002 (CPU: ex c/r=42, ex row=333282, ex cyc=14303409, inc cyc=14303409)
                dn_6003_6004 (CPU: ex c/r=20, ex row=333393, ex cyc=6982847, inc cyc=6982847)
                dn_6005_6006 (CPU: ex c/r=23, ex row=333325, ex cyc=7819311, inc cyc=7819311)
               ->  Seq Scan on db_uim.tt2  (cost=0.00..6956.33 rows=1000000 width=51)
                     Output: tt2.id, tt2.c_name, tt2.c_info
                      dn_6001_6002 (actual time=0.015..81.603 rows=333333 loops=1)
                      dn_6003_6004 (actual time=0.009..65.472 rows=333334 loops=1)
                      dn_6005_6006 (actual time=0.015..81.391 rows=333333 loops=1)
                      dn_6001_6002 (Buffers: shared hit=3623)
                      dn_6003_6004 (Buffers: shared hit=3623)
                      dn_6005_6006 (Buffers: shared hit=3623)
                      dn_6001_6002 (CPU: ex c/r=20, ex row=333333, ex cyc=6671985, inc cyc=6671985)
                      dn_6003_6004 (CPU: ex c/r=15, ex row=333334, ex cyc=5035708, inc cyc=5035708)
                      dn_6005_6006 (CPU: ex c/r=19, ex row=333333, ex cyc=6618734, inc cyc=6618734)
         ->  Hash  (cost=6956.33..6956.33 rows=999999 distinct=333333 width=51)
               Output: t1.id, t1.c_name, t1.c_info
                dn_6001_6002 (actual time=168.960..168.960 rows=333282 loops=1)
                dn_6003_6004 (actual time=216.331..216.331 rows=333393 loops=1)
                dn_6005_6006 (actual time=239.887..239.887 rows=333325 loops=1)
                dn_6001_6002 Buckets: 524288  Batches: 1  Memory Usage: 27015kB
                dn_6003_6004 Buckets: 524288  Batches: 1  Memory Usage: 27024kB
                dn_6005_6006 Buckets: 524288  Batches: 1  Memory Usage: 27018kB
                dn_6001_6002 (Buffers: shared hit=3622)
                dn_6003_6004 (Buffers: shared hit=3624)
                dn_6005_6006 (Buffers: shared hit=3623)
                dn_6001_6002 (CPU: ex c/r=35, ex row=333282, ex cyc=11989498, inc cyc=16895994)
                dn_6003_6004 (CPU: ex c/r=48, ex row=333393, ex cyc=16134097, inc cyc=21633015)
                dn_6005_6006 (CPU: ex c/r=48, ex row=333325, ex cyc=16045461, inc cyc=23988758)
               ->  Seq Scan on db_uim.t1  (cost=0.00..6956.33 rows=1000000 width=51)
                     Output: t1.id, t1.c_name, t1.c_info
                     Distribute Key: t1.id
                      dn_6001_6002 (actual time=0.016..64.829 rows=333282 loops=1)
                      dn_6003_6004 (actual time=0.034..73.310 rows=333393 loops=1)
                      dn_6005_6006 (actual time=0.048..96.832 rows=333325 loops=1)
                      dn_6001_6002 (Buffers: shared hit=3622)
                      dn_6003_6004 (Buffers: shared hit=3624)
                      dn_6005_6006 (Buffers: shared hit=3623)
                      dn_6001_6002 (CPU: ex c/r=14, ex row=333282, ex cyc=4906496, inc cyc=4906496)
                      dn_6003_6004 (CPU: ex c/r=16, ex row=333393, ex cyc=5498918, inc cyc=5498918)
                      dn_6005_6006 (CPU: ex c/r=23, ex row=333325, ex cyc=7943297, inc cyc=7943297)
 Total runtime: 1198.308 ms
(69 rows)

不足？

比较遗憾的是当显示'REMOTE_FQS_QUERY'时单表查询的时候dws并不能从执行计划中直接看出是否走了索引：pg_stat_all_index里面所有的数据都是0，所以没法统计是否使用到索引

postgres=#  explain (analyze,buffers,verbose,cpu,nodes )select * from t1 where id = '1';
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Data Node Scan on "__REMOTE_FQS_QUERY__"  (cost=0.00..0.00 rows=0 width=0) (actual time=91.061..127.316 rows=8 loops=1)
   Output: t1.id, t1.c_name, t1.c_info
   Node/s: All datanodes
   Remote query: SELECT id, c_name, c_info FROM sa.t1 WHERE id::text = '1'::text
   (CPU: ex c/r=1591454, ex row=8, ex cyc=12731636, inc cyc=12731636)
 Total runtime: 127.485 ms
(6 rows)

--创建索引后
postgres=# create index i_t1_id on t1(id);
CREATE INDEX

--下面是建立索引后走索引的执行计划和上面没有走索引的执行计划完全一样，只是时间上不一样，没有其他区别。
postgres=# set explain_perf_mode=normal;
SET
postgres=# explain (analyze,buffers,verbose,cpu,nodes )select * from t1 where id = '1';
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Data Node Scan on "__REMOTE_FQS_QUERY__"  (cost=0.00..0.00 rows=0 width=0) (actual time=1.522..1.927 rows=8 loops=1)
   Output: t1.id, t1.c_name, t1.c_info
   Node/s: All datanodes
   Remote query: SELECT id, c_name, c_info FROM sa.t1 WHERE id::text = '1'::text
   (CPU: ex c/r=24064, ex row=8, ex cyc=192517, inc cyc=192517)
 Total runtime: 2.119 ms
(6 rows)

--连表查询并且关联键不都是分布键的时候，t1的分布键是hash分布id，而t2表没有加分布键，是随机分布。这样就可以看到是否走了索引：
postgres=# create index i_t1_id on t1(id);
CREATE INDEX
postgres=# explain (analyze,buffers,verbose,cpu,nodes )select * from t1 join t2 on t1.id=t2.id where t1.id='650bb8387c5008006527fa25e490fffe';
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=4.46..25.38 rows=8 width=73) (actual time=62.472..62.472 rows=0 loops=1)
   Output: t1.id, t1.c_name, t1.c_info, t2.id, t2.c_name
   Node/s: All datanodes
   (Buffers: shared hit=1)
   (CPU: ex c/r=0, ex row=0, ex cyc=6247153, inc cyc=6247153)
   ->  Nested Loop  (cost=4.28..17.38 rows=8 width=73) (actual time=[47.783,47.783]..[57.506,57.506], rows=0)
         Output: t1.id, t1.c_name, t1.c_info, t2.id, t2.c_name
         CPU: max_ex c/r=0, max_ex cyc=3019, max_inc cyc=5750573
              min_ex c/r=0, min_ex cyc=1657, min_inc cyc=4778272
         ->  Streaming(type: BROADCAST)  (cost=0.00..1.28 rows=3 width=43) (actual time=[1.629,47.648]..[49.679,57.300], rows=3)
               Output: t2.id, t2.c_name
               Spawn on: All datanodes
               Consumer Nodes: All datanodes
               CPU: max_ex c/r=5729986, max_ex cyc=5729986, max_inc cyc=5729986
                    min_ex c/r=4764689, min_ex cyc=4764689, min_inc cyc=4764689
               ->  Seq Scan on sa.t2  (cost=0.00..1.01 rows=1 width=43) (actual time=[0.002,0.002]..[0.209,0.211], rows=1)
                     Output: t2.id, t2.c_name
                     Filter: ((t2.id)::text = '650bb8387c5008006527fa25e490fffe'::text)
                     Buffers: shared max hit=18
                     CPU: max_ex c/r=21024, max_ex cyc=21024, max_inc cyc=21024
                          min_ex c/r=0, min_ex cyc=230, min_inc cyc=230
         ->  Bitmap Heap Scan on sa.t1  (cost=4.27..16.06 rows=8 width=30) (actual time=[0.119,0.119]..[0.211,0.211], rows=0)
               Output: t1.id, t1.c_name, t1.c_info
               Recheck Cond: ((t1.id)::text = '650bb8387c5008006527fa25e490fffe'::text)
               Buffers: shared max read=4 min  read=4
               CPU: max_ex c/r=0, max_ex cyc=602, max_inc cyc=21173
                    min_ex c/r=0, min_ex cyc=398, min_inc cyc=11926
               ->  Bitmap Index Scan on i_t1_id  (cost=0.00..4.27 rows=8 width=0) (actual time=[0.115,0.115]..[0.206,0.206], rows=0)
                     Index Cond: ((t1.id)::text = '650bb8387c5008006527fa25e490fffe'::text)
                     Buffers: shared max read=4 min  read=4
                     CPU: max_ex c/r=0, max_ex cyc=20571, max_inc cyc=20571
                          min_ex c/r=0, min_ex cyc=11528, min_inc cyc=11528
 Total runtime: 63.293 ms
(33 rows)

分析REMOTE_FSQ_QUERY

没有索引的情况下：

create table t5 (bh varchar(300),bh2 varchar(300),c_name varchar(300),c_info varchar(300))distribute by hash(bh);
insert into t4 select  uuid_generate_v1(), uuid_generate_v1(),'测试','sdfffffffffffffffsdf' from   generate_series(1,50000);
insert into t4 select * from t4;

--没有索引的情况下：
postgres=# explain analyze select *  from t4 where bh2  = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
  id |                  operation                   | A-time  | A-rows | E-rows | Peak Memory | A-width | E-width | E-costs 
 ----+----------------------------------------------+---------+--------+--------+-------------+---------+---------+---------
   1 | ->  Data Node Scan on "__REMOTE_FQS_QUERY__" | 256.364 |     32 |      0 | 56KB        |         |       0 | 0.00    

        ====== Query Summary =====        
 -----------------------------------------
 Coordinator executor start time: 0.055 ms
 Coordinator executor run time: 256.410 ms
 Coordinator executor end time: 0.010 ms
 Planner runtime: 0.145 ms
 Query Id: 73746443917091633
 Total runtime: 256.557 ms
(12 rows)

Time: 259.051 ms


--添加索引：
postgres=# create index i_t4 on t4(bh2);  
CREATE INDEX
Time: 3328.258 ms
postgres=# explain analyze select /*+ indexscan(t4 i_t4) */ * from t4 where bh2  = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
  id |                  operation                   | A-time | A-rows | E-rows | Peak Memory | A-width | E-width | E-costs 
 ----+----------------------------------------------+--------+--------+--------+-------------+---------+---------+---------
   1 | ->  Data Node Scan on "__REMOTE_FQS_QUERY__" | 2.269  |     32 |      0 | 56KB        |         |       0 | 0.00    

        ====== Query Summary =====        
 -----------------------------------------
 Coordinator executor start time: 0.027 ms
 Coordinator executor run time: 2.298 ms
 Coordinator executor end time: 0.009 ms
 Planner runtime: 0.074 ms
 Query Id: 73746443917091930
 Total runtime: 2.401 ms
(12 rows)

执行计划过于简洁，并没有打印出是否走了索引，实际上有索引比没有索引快很多。执行计划没法直观的看到，并且pg_stat_all_indexes中业务表的所有统计信息都是0，也没法判断。
但是！enable_fast_query_shipping可以控制。这种查询由于语句较为简单，不需要在cn上生成执行计划，如果关闭该参数则会在cn上面生成执行计划，示例如下

--__REMOTE_FQS_QUERY__是直接将语句发送给了nodedata，所以cn节点不生成执行计划，所以没法看到是否走索引，
--如果我们将enable_fast_query_shipping关闭，就能在cn上面生成执行计划，可以看到是否走了索引。
--关闭参数
postgres=# set enable_fast_query_shipping to off;
SET
Time: 1.349 ms
--将格式设置为normal
postgres=# set explain_perf_mode=normal;
SET
Time: 2.693 ms
--可以看到走全表扫描
postgres=# explain analyze select *  from t4 where bh2  = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=0.62..31755.34 rows=31 width=102) (actual time=255.232..334.125 rows=32 loops=1)
   Node/s: All datanodes
   ->  Seq Scan on t4  (cost=0.00..31747.34 rows=31 width=102) (actual time=[0.049,253.901]..[332.507,332.507], rows=32)
         Filter: ((bh2)::text = '652e4e0e-ba60-0400-25b5-4ee5e490fffe'::text)
         Rows Removed by Filter: 3199968
 Total runtime: 334.433 ms
(6 rows)

Time: 336.489 ms
postgres=# 
postgres=# create index i_t4 on t4(bh2); 
CREATE INDEX
Time: 3852.153 ms
--添加索引后可以看到走索引了
postgres=# explain analyze select *  from t4 where bh2  = '652e4e0e-ba60-0400-25b5-4ee5e490fffe';
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=4.95..51.75 rows=31 width=102) (actual time=2.988..11.482 rows=32 loops=1)
   Node/s: All datanodes
   ->  Bitmap Heap Scan on t4  (cost=4.33..43.75 rows=31 width=102) (actual time=[0.138,0.138]..[0.305,0.455], rows=32)
         Recheck Cond: ((bh2)::text = '652e4e0e-ba60-0400-25b5-4ee5e490fffe'::text)
         ->  Bitmap Index Scan on i_t4  (cost=0.00..4.33 rows=31 width=0) (actual time=[0.133,0.133]..[0.265,0.265], rows=32)
               Index Cond: ((bh2)::text = '652e4e0e-ba60-0400-25b5-4ee5e490fffe'::text)
 Total runtime: 12.014 ms
(7 rows)

Time: 15.506 ms

FQS（Fast Query Shipping）执行方式即完全下推语句：下发语句到各DN的执行方式，该种执行方式未使用分布式执行框架，适用于各DN间无数据交互的场景。
FQS计划收参数enable_fast_query_shipping控制，设置为“on”，表示允许FQS计划，设置为off表示使用Stream计划。此外enable_stream_operator参数表示是否允许使用Stream计划

总结

总的来说dws的执行计划在原有的基础上加了一些参数和功能，组合情况更加丰富
排查过程中发现统计信息表pg_stat_all_indexes，pg_stat_all_tables中idx_scan的数据都是0，dws也没有做修改
官方文档说明带有REMOTE_FQS_QUERY的执行计划通常不需要调优，但是REMOTE_FQS_QUERY并不能直接看出是否走了索引，验证发现有索引效率是要高一些，在某些场景下可能会用到。
当执行计划中带有REMOTE_FQS_QUERY，可以通过enable_fast_query_shipping控制是否在cn节点生成执行计划查看是否用到了索引
dws和gp一样默认也是关闭了merge join和nested loop join，默认都建议使用hash join

如果您发现该资源为电子书等存在侵权的资源或对该资源描述不正确等，可点击“私信”按钮向作者进行反馈；如作者无回复可进行平台仲裁，我们会在第一时间进行处理！

评价 0 条

GaussDB DWS执行计划浅析

GaussDB DWS执行计划浅析

关键字

表访问方式

验证

DWS中常见类型计划

DWS诊断信息

详细信息performance

不足？

分析REMOTE_FSQ_QUERY

总结

相关资源

关注我们