示例表数据如下:

test=# select * from tcost ;
 path | cost 
------+-------
  111 |  23.3
  111 |  33.4
  111 |   3.4
  222 |   3.4
  222 |  33.4
  222 | 333.4
   32 |   3.4
   32 |   0.4
   32 |  0.04
(9 rows)

test=# 

ROWS

test=# select path, cost, sum(cost) over (order by cost desc) as sum_cost, sum(cost) over (order by cost desc rows between current row and 2 following ) as row from tcost;
 path | cost  | sum_cost |  row 
------+-------+----------+-------
  222 | 333.4 |    333.4 | 400.2
  111 |  33.4 |    400.2 |  90.1
  222 |  33.4 |    400.2 |  60.1
  111 |  23.3 |    423.5 |  30.1
  111 |   3.4 |    433.7 |  10.2
   32 |   3.4 |    433.7 |   7.2
  222 |   3.4 |    433.7 |  3.84
   32 |   0.4 |    434.1 |  0.44
   32 |  0.04 |   434.14 |  0.04
(9 rows)

test=# 

可以看到 max2 的值都是由 current row (当前行) and 2 following(紧接着2行)的sum()结果出来的。 即 400.2 = 333.4 + 33.4 + 33.4 得出的。 90.1 = 33.4 + 33.4 + 23.3 得出的。

注意,上面那条SQL没有写 partition by ,那默认情况下就是以整个表来表示窗口化的,即只有一个窗口。现在试着,添加上partition by 语句的结果看看:

test=# select path, cost, sum(cost) over (order by cost desc) as sum_cost, sum(cost) over (partition by path order by cost desc rows between current row and 2 following ) as row from tcost;
 path | cost  | sum_cost |  row 
------+-------+----------+-------
   32 |   3.4 |    433.7 |  3.84
   32 |   0.4 |    434.1 |  0.44
   32 |  0.04 |   434.14 |  0.04
  111 |  33.4 |    400.2 |  60.1
  111 |  23.3 |    423.5 |  26.7
  111 |   3.4 |    433.7 |   3.4
  222 | 333.4 |    333.4 | 370.2
  222 |  33.4 |    400.2 |  36.8
  222 |   3.4 |    433.7 |   3.4
(9 rows)

test=# 

当 partition by path时,可以看到,这些ROWS 模式,都是在当前所在的窗口来进行的,并不会跨窗口来进行。 所以,在这里强调一下,ROWS表示的是物理行。

RANGE

看看,当是RANGE时的结果

test=# select path, cost, sum(cost) over (order by cost desc) as sum_cost, sum(cost) over (order by cost desc range  between current row and  UNBOUNDED  following ) as range from tcost;
 path | cost  | sum_cost | range 
------+-------+----------+--------
  222 | 333.4 |    333.4 | 434.14
  111 |  33.4 |    400.2 | 100.74
  222 |  33.4 |    400.2 | 100.74
  111 |  23.3 |    423.5 |  33.94
  111 |   3.4 |    433.7 |  10.64
   32 |   3.4 |    433.7 |  10.64
  222 |   3.4 |    433.7 |  10.64
   32 |   0.4 |    434.1 |   0.44
   32 |  0.04 |   434.14 |   0.04
(9 rows)

test=# 

可以看到,RANGE时,相同数据的会被合并到一起再来进行计算,也表明,列中具有相同值的range的值也是相同的,并且结果是它们合并后进行计算后的结果。

test=# select path, cost, sum(cost) over (order by cost desc) as sum_cost, sum(cost) over (order by cost desc range  between current row and  UNBOUNDED  following ) as range,sum(cost) over (order by cost desc rows between current row and UNBOUNDED  following ) as row from tcost;
 path | cost  | sum_cost | range  |  row  
------+-------+----------+--------+--------
  222 | 333.4 |    333.4 | 434.14 | 434.14
  111 |  33.4 |    400.2 | 100.74 | 100.74
  222 |  33.4 |    400.2 | 100.74 |  67.34
  111 |  23.3 |    423.5 |  33.94 |  33.94
  111 |   3.4 |    433.7 |  10.64 |  10.64
   32 |   3.4 |    433.7 |  10.64 |   7.24
  222 |   3.4 |    433.7 |  10.64 |   3.84
   32 |   0.4 |    434.1 |   0.44 |   0.44
   32 |  0.04 |   434.14 |   0.04 |   0.04
(9 rows)

test=# 

这里可以非常明显看到RANGE和ROWS的区别。 ROWS:是按物理行来进行区分的 RANGE:是按数值进行逻辑区分的

RANGE 和 ROWS 在PostgreSQL中的语法

[ RANGE | ROWS ] frame_start
[ RANGE | ROWS ] BETWEEN frame_start AND frame_end

frame_start 和 frame_end可以是:

UNBOUNDED PRECEDING
value PRECEDING
CURRENT ROW
value FOLLOWING
UNBOUNDED FOLLOWING

特别注意:value PRECEDING和value FOLLOWING 当前只允许ROWS模式。 RANGE模式后面只能接 UNBOUNDED FOLLOWING。

默认的框架选项是RANGE UNBOUNDED PRECEDING,该选项与 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW相同。有ORDER BY, 它设置框架从分区的开始一直到与当前行相同的最后一行。没有ORDER BY, 那么就是当前分区的所有行都包含在框架中,因为所有行都会成为当前行的相同行。

RANGE时,请注意有没有 order by 的区别

test=# select path, cost, sum(cost) over (order by cost desc) as sum_cost, sum(cost) over (range  between current row and  UNBOUNDED  following ) as no_order_by_range,sum(cost) over (order by cost desc range between current row and UNBOUNDED  following ) as has_order_by_range from tcost;
 path | cost  | sum_cost | no_order_by_range | has_order_by_range
------+-------+----------+-------------------+--------------------
  222 | 333.4 |    333.4 |            434.14 |             434.14
  111 |  33.4 |    400.2 |            434.14 |             100.74
  222 |  33.4 |    400.2 |            434.14 |             100.74
  111 |  23.3 |    423.5 |            434.14 |              33.94
  111 |   3.4 |    433.7 |            434.14 |              10.64
   32 |   3.4 |    433.7 |            434.14 |              10.64
  222 |   3.4 |    433.7 |            434.14 |              10.64
   32 |   0.4 |    434.1 |            434.14 |               0.44
   32 |  0.04 |   434.14 |            434.14 |               0.04
(9 rows)

test=# 
没有ORDER BY, 那么就是当前分区的所有行都包含在框架中,因为所有行都会成为当前行的相同行。

ROWS时,请注意有没有 order by 的区别

test=# select path, cost, sum(cost) over (order by cost desc) as sum_cost, sum(cost) over (rows  between current row and  UNBOUNDED  following ) as no_order_by_rows,sum(cost) over (order by cost desc rows between current row and UNBOUNDED  following ) as has_order_by_rows from tcost;
 path | cost  | sum_cost | no_order_by_rows | has_order_by_rows
------+-------+----------+------------------+-------------------
  222 | 333.4 |    333.4 |           434.14 |            434.14
  111 |  33.4 |    400.2 |           100.74 |            100.74
  222 |  33.4 |    400.2 |            67.34 |             67.34
  111 |  23.3 |    423.5 |            33.94 |             33.94
  111 |   3.4 |    433.7 |            10.64 |             10.64
   32 |   3.4 |    433.7 |             7.24 |              7.24
  222 |   3.4 |    433.7 |             3.84 |              3.84
   32 |   0.4 |    434.1 |             0.44 |              0.44
   32 |  0.04 |   434.14 |             0.04 |              0.04
(9 rows)

test=# 

有没有ORDER BY,都是一样的,因为ROWS是按物理分行的,而不是按逻辑分行的。

总结

ROWS:是按物理行来进行窗口级别里再次进行范围选择的。
RANGE:是按逻辑行来进行窗口级别里再次进行范围选择的。RANGE时,相同行会被合并成同一条数据再进行计算,相同行窗口计算时的结果也是相同的。
是否是相同行,是根据ORDER BY排序时的结果决定的。
有ORDER BY时:同行是说在ORDER BY排序时不唯一的行。【即具有相同数值的行】
             不同行是说ORDER BY排序时具有不同的数值的行。

没有ORDER BY:那么就是当前分区的所有行都包含在框架中,因为所有行都会成为当前行的相同行。【特别要注意最后一句的意思】