并行查询

这个是PostgreSQL 9.6才开始有的。

测试环境是4核心的i5CPU,其他配置为PostgreSQL默认.

测试数据

[local]:5432 sky@sky=# CREATE TABLE testcount(id int);
CREATE TABLE
Time: 2.733 ms
[local]:5432 sky@sky=# INSERT INTO testcount VALUES (generate_series(1,15000000));
INSERT 0 15000000
Time: 11145.722 ms
[local]:5432 sky@sky=# VACUUM FULL ;
VACUUM
Time: 8084.445 ms

非并行查询

[local]:5432 sky@sky=# set max_parallel_workers_per_gather = 0;
SET
Time: 0.164 ms

[local]:5432 sky@sky=# EXPLAIN (analyze, verbose, buffers) select count(1) from testcount ;
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=277932.75..277932.76 rows=1 width=8) (actual time=1982.173..1982.173 rows=1 loops=1)
   Output: count(1)
   Buffers: shared hit=896 read=65476
   ->  Seq Scan on public.testcount  (cost=0.00..235620.60 rows=16924860 width=0) (actual time=0.022..884.887 rows=15000000 loops=1)
         Output: id
         Buffers: shared hit=896 read=65476
 Planning time: 0.029 ms
 Execution time: 1982.196 ms
(8 rows)


[local]:5432 sky@sky=# SELECT count(1) from testcount ;
  count   
----------
 15000000
(1 row)

Time: 1005.757 ms

开启并行查询

开启1个worker

[local]:5432 sky@sky=# set max_parallel_workers_per_gather = 1;
SET
Time: 0.131 ms
[local]:5432 sky@sky=# EXPLAIN (analyze, verbose, buffers) select count(1) from testcount ;
                                                                      QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=191819.61..191819.62 rows=1 width=8) (actual time=1016.820..1016.820 rows=1 loops=1)
   Output: count(1)
   Buffers: shared hit=1326 read=65124
   ->  Gather  (cost=191819.50..191819.61 rows=1 width=8) (actual time=1016.761..1016.817 rows=2 loops=1)
         Output: (PARTIAL count(1))
         Workers Planned: 1
         Workers Launched: 1
         Buffers: shared hit=1326 read=65124
         ->  Partial Aggregate  (cost=190819.50..190819.51 rows=1 width=8) (actual time=1015.814..1015.814 rows=1 loops=2)
               Output: PARTIAL count(1)
               Buffers: shared hit=1248 read=65124
               Worker 0: actual time=1014.972..1014.972 rows=1 loops=1
                 Buffers: shared hit=595 read=32170
               ->  Parallel Seq Scan on public.testcount  (cost=0.00..165930.00 rows=9955800 width=0) (actual time=0.029..500.854 rows=7500000 loops=2)
                     Buffers: shared hit=1248 read=65124
                     Worker 0: actual time=0.027..491.502 rows=7404818 loops=1
                       Buffers: shared hit=595 read=32170
 Planning time: 0.033 ms
 Execution time: 1017.595 ms
(19 rows)

[local]:5432 sky@sky=# SELECT count(1) from testcount ;
  count   
----------
 15000000
(1 row)

Time: 660.512 m

开启2个worker

[local]:5432 sky@sky=# set max_parallel_workers_per_gather = 2;
SET
Time: 0.125 ms
[local]:5432 sky@sky=# EXPLAIN (analyze, verbose, buffers) select count(1) from testcount ;

                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=155522.53..155522.54 rows=1 width=8) (actual time=766.411..766.411 rows=1 loops=1)
   Output: count(1)
   Buffers: shared hit=1532 read=64996
   ->  Gather  (cost=155522.31..155522.52 rows=2 width=8) (actual time=766.365..766.406 rows=3 loops=1)
         Output: (PARTIAL count(1))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=1532 read=64996
         ->  Partial Aggregate  (cost=154522.31..154522.32 rows=1 width=8) (actual time=764.665..764.665 rows=1 loops=3)
               Output: PARTIAL count(1)
               Buffers: shared hit=1376 read=64996
               Worker 0: actual time=763.246..763.246 rows=1 loops=1
                 Buffers: shared hit=488 read=23400
               Worker 1: actual time=764.496..764.496 rows=1 loops=1
                 Buffers: shared hit=514 read=23576
               ->  Parallel Seq Scan on public.testcount  (cost=0.00..136892.25 rows=7052025 width=0) (actual time=0.025..370.283 rows=5000000 loops=3)
                     Buffers: shared hit=1376 read=64996
                     Worker 0: actual time=0.018..350.004 rows=5398688 loops=1
                       Buffers: shared hit=488 read=23400
                     Worker 1: actual time=0.020..372.743 rows=5444268 loops=1
                       Buffers: shared hit=514 read=23576
 Planning time: 0.044 ms
 Execution time: 767.434 ms
(23 rows)

[local]:5432 sky@sky=# SELECT count(1) from testcount ;
  count   
----------
 15000000
(1 row)

Time: 502.905 ms

开启3个worker

[local]:5432 sky@sky=# set max_parallel_workers_per_gather = 3;
SET
Time: 0.177 ms
[local]:5432 sky@sky=# EXPLAIN (analyze, verbose, buffers) select count(1) from testcount ;
                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=135617.72..135617.73 rows=1 width=8) (actual time=757.778..757.778 rows=1 loops=1)
   Output: count(1)
   Buffers: shared hit=1802 read=64804
   ->  Gather  (cost=135617.40..135617.71 rows=3 width=8) (actual time=757.021..757.775 rows=4 loops=1)
         Output: (PARTIAL count(1))
         Workers Planned: 3
         Workers Launched: 3
         Buffers: shared hit=1802 read=64804
         ->  Partial Aggregate  (cost=134617.40..134617.41 rows=1 width=8) (actual time=753.981..753.981 rows=1 loops=4)
               Output: PARTIAL count(1)
               Buffers: shared hit=1568 read=64804
               Worker 0: actual time=749.686..749.686 rows=1 loops=1
                 Buffers: shared hit=511 read=11359
               Worker 1: actual time=753.743..753.743 rows=1 loops=1
                 Buffers: shared hit=481 read=16752
               Worker 2: actual time=755.592..755.592 rows=1 loops=1
                 Buffers: shared hit=334 read=20173
               ->  Parallel Seq Scan on public.testcount  (cost=0.00..120968.32 rows=5459632 width=0) (actual time=0.019..357.230 rows=3750000 loops=4)
                     Buffers: shared hit=1568 read=64804
                     Worker 0: actual time=0.015..315.006 rows=2682548 loops=1
                       Buffers: shared hit=511 read=11359
                     Worker 1: actual time=0.015..346.246 rows=3894658 loops=1
                       Buffers: shared hit=481 read=16752
                     Worker 2: actual time=0.015..408.609 rows=4634582 loops=1
                       Buffers: shared hit=334 read=20173
 Planning time: 0.033 ms
 Execution time: 757.836 ms
(27 rows)

[local]:5432 sky@sky=# SELECT count(1) from testcount ;
  count   
----------
 15000000
(1 row)

Time: 514.604 ms

开启4个worker

[local]:5432 sky@sky=# set max_parallel_workers_per_gather = 4;
SET
Time: 0.149 ms
[local]:5432 sky@sky=# EXPLAIN (analyze, verbose, buffers) select count(1) from testcount ;


                                                                       QUERY PLAN                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=120262.61..120262.62 rows=1 width=8) (actual time=799.792..799.792 rows=1 loops=1)
   Output: count(1)
   Buffers: shared hit=2264 read=64420
   ->  Gather  (cost=120262.19..120262.60 rows=4 width=8) (actual time=799.782..799.789 rows=5 loops=1)
         Output: (PARTIAL count(1))
         Workers Planned: 4
         Workers Launched: 4
         Buffers: shared hit=2264 read=64420
         ->  Partial Aggregate  (cost=119262.19..119262.20 rows=1 width=8) (actual time=791.188..791.188 rows=1 loops=5)
               Output: PARTIAL count(1)
               Buffers: shared hit=1952 read=64420
               Worker 0: actual time=793.609..793.609 rows=1 loops=1
                 Buffers: shared hit=93 read=11941
               Worker 1: actual time=783.423..783.423 rows=1 loops=1
                 Buffers: shared hit=352 read=12496
               Worker 2: actual time=787.593..787.593 rows=1 loops=1
                 Buffers: shared hit=612 read=14750
               Worker 3: actual time=791.640..791.640 rows=1 loops=1
                 Buffers: shared hit=528 read=14056
               ->  Parallel Seq Scan on public.testcount  (cost=0.00..108684.15 rows=4231215 width=0) (actual time=0.017..368.500 rows=3000000 loops=5)
                     Buffers: shared hit=1952 read=64420
                     Worker 0: actual time=0.017..389.196 rows=2719684 loops=1
                       Buffers: shared hit=93 read=11941
                     Worker 1: actual time=0.016..379.447 rows=2903648 loops=1
                       Buffers: shared hit=352 read=12496
                     Worker 2: actual time=0.015..371.381 rows=3471812 loops=1
                       Buffers: shared hit=612 read=14750
                     Worker 3: actual time=0.014..315.571 rows=3295912 loops=1
                       Buffers: shared hit=528 read=14056
 Planning time: 0.030 ms
 Execution time: 802.207 ms
(31 rows)

[local]:5432 sky@sky=# SELECT count(1) from testcount ;
  count   
----------
 15000000
(1 row)

Time: 513.976 ms

与并行相关的参数

德哥 阿里云栖社区 PostgreSQL 9.6 并行计算 优化器算法浅析