天天看點

PostgreSQL Oracle PL/SQL 相容性之 - AGGREGATE USING Clause 聚合函數

标簽

PostgreSQL , Oracle , PL/SQL , 聚合函數 , 自定義聚合函數

https://github.com/digoal/blog/blob/master/201805/20180511_01.md#%E8%83%8C%E6%99%AF 背景

Oracle的自定義聚合函數的定義方法,在建立函數是,使用AGGREGATE USING Clause關鍵詞。

AGGREGATE USING Clause

Specify AGGREGATE USING to identify this function as an aggregate function, or one that evaluates a group of rows and returns a single row. You can specify aggregate functions in the select list, HAVING clause, and ORDER BY clause.

When you specify a user-defined aggregate function in a query, you can treat it as an analytic function (one that operates on a query result set). To do so, use the OVER analytic_clause syntax available for built-in analytic functions. See "

Analytic Functions

" for syntax and semantics.

In the USING clause, specify the name of the implementation type of the function. The implementation type must be an object type containing the implementation of the ODCIAggregate routines. If you do not specify schema, Oracle Database assumes that the implementation type is in your own schema.

Restriction on Creating Aggregate Functions

If you specify this clause, you can specify only one input argument for the function.

自定義的聚合函數,與普通聚合函數一樣,可以用于聚合、KEEP等操作SQL中。

https://github.com/digoal/blog/blob/master/201805/20180511_01.md#postgresql-%E8%81%9A%E5%90%88%E5%87%BD%E6%95%B0%E7%94%A8%E6%B3%95 PostgreSQL 聚合函數用法

https://www.postgresql.org/docs/10/static/functions-aggregate.html 《PostgreSQL aggregate function 1 : General-Purpose Aggregate Functions》 《PostgreSQL aggregate function 2 : Aggregate Functions for Statistics》 《PostgreSQL aggregate function 3 : Aggregate Functions for Ordered-Set》 《PostgreSQL aggregate function 4 : Hypothetical-Set Aggregate Functions》

https://github.com/digoal/blog/blob/master/201805/20180511_01.md#postgresql-%E8%87%AA%E5%AE%9A%E4%B9%89%E8%81%9A%E5%90%88%E5%87%BD%E6%95%B0 PostgreSQL 自定義聚合函數

1、自定義普通聚合函數:

《PostgreSQL aggregate function customize》

2、自定義并行聚合函數:

《PostgreSQL Oracle 相容性之 - 自定義并行聚合函數 PARALLEL_ENABLE AGGREGATE》 《PostgreSQL 10 自定義并行計算聚合函數的原理與實踐 - (含array_agg合并多個數組為單個一進制數組的例子)》

3、在postgres-xc中自定義多階段分布式并行聚合函數:

《Postgres-XC customized aggregate introduction》

4、在greenplum中自定義多階段分布式并行聚合函數:

《Greenplum 最佳實踐 - 估值插件hll的使用(以及hll分式聚合函數優化)》

文法:

https://www.postgresql.org/docs/10/static/xaggr.html https://www.postgresql.org/docs/10/static/sql-createaggregate.html

https://github.com/digoal/blog/blob/master/201805/20180511_01.md#%E4%BE%8B%E5%AD%90 例子

1、普通聚合函數

目标:将結果聚合,并按某個字段輸出為有序數組。

測試表

create table recommendation_mpt (user_id int8, app_id numeric, rating numeric);    
insert into recommendation_mpt select generate_series(1,10000), generate_series(1,41), random();    
           

聚合過程中最後一步,資料排序處理函數

create or replace function final_array_agg (i_text text) returns text[] as $$    
declare    
  result text[];    
begin    
  select array_agg(app_id||'_'||rating) into result from     
    (select split_part(i,'_',1) as app_id,   
            split_part(i,'_',2) as rating   -- 按它排序  
      from     
      regexp_split_to_table(i_text,',') t(i)    
      order by 2 desc) t;    
  return result;    
end;    
$$ language plpgsql strict;    
           
create aggregate agg_append (text) (    
sfunc = textcat,    
stype = text,    
FINALFUNC = final_array_agg);    
           
select agg_append(app_id||'_'||rating||',') from recommendation_mpt;    
           
postgres=# select agg_append(app_id||'_'||rating||',') from recommendation_mpt;    
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
agg_append | {24_0.91642474103719,3_0.86293408786878,21_0.824714167509228,41_0.823069900739938,28_0.82022201269865,17_0.800656013656408,33_0.764910507481545,25_0.760074479039758,30_0.757540909573436,13_0.707890411838889,20_0.704598274547607,5_0.675859381910414,40_0.674109968356788,37_0.671832457184792,31_0.666503502987325,35_0.641303175128996,23_0.640862574335188,12_0.639161774888635,10_0.634707988705486,1_0.630520141683519,39_0.589550276752561,7_0.547058736439794,4_0.541917834896594,15_0.535650313366205,34_0.529437590856105,29_0.468865198083222,14_0.456227377057076,36_0.440769889391959,27_0.431988585740328,26_0.408387354109436,22_0.359426050912589,18_0.329283143393695,19_0.266014957334846,38_0.188361912034452,16_0.150509809609503,8_0.148780386894941,6_0.142394866328686,11_0.116577256470919,32_0.0993853402324021,2_0.00736959790810943,9_0.00227751117199659,_}  
           

當然,這個實際上現在PG已經内置了文法來支援,上面隻是示範一下自定義聚合函數。

内置ORDER BY,通過string_agg進行聚合:

postgres=# select string_agg(app_id||'_'||rating, ',' order by rating desc) from recommendation_mpt ;  
-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
string_agg | 24_0.91642474103719,3_0.86293408786878,21_0.824714167509228,41_0.823069900739938,28_0.82022201269865,17_0.800656013656408,33_0.764910507481545,25_0.760074479039758,30_0.757540909573436,13_0.707890411838889,20_0.704598274547607,5_0.675859381910414,40_0.674109968356788,37_0.671832457184792,31_0.666503502987325,35_0.641303175128996,23_0.640862574335188,12_0.639161774888635,10_0.634707988705486,1_0.630520141683519,39_0.589550276752561,7_0.547058736439794,4_0.541917834896594,15_0.535650313366205,34_0.529437590856105,29_0.468865198083222,14_0.456227377057076,36_0.440769889391959,27_0.431988585740328,26_0.408387354109436,22_0.359426050912589,18_0.329283143393695,19_0.266014957334846,38_0.188361912034452,16_0.150509809609503,8_0.148780386894941,6_0.142394866328686,11_0.116577256470919,32_0.0993853402324021,2_0.00736959790810943,9_0.00227751117199659  
           

https://github.com/digoal/blog/blob/master/201805/20180511_01.md#%E5%8F%82%E8%80%83 參考

https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_5009.htm 《PostgreSQL Oracle 相容性之 - PL/SQL pipelined》