天天看點

利用Oracle線上重定義Online Redefinition清理曆史資料

利用Oracle線上重定義Online Redefinition清理曆史資料

我在<了解Oracle線上重定義Online Redefinition>一文中介紹了Oracle線上重定義的特點及其使用步驟,Online Redefinition的适用場景很多,包括:  

  • Modify the storage parameters of a table or cluster
  • Move a table or cluster to a different tablespace
  • Add, modify, or drop one or more columns in a table or cluster
  • Add or drop partitioning support (non-clustered tables only)
  • Change partition structure
  • Change physical properties of a single table partition, including moving it to a different tablespace in the same schema
  • Change physical properties of a materialized view log or an Oracle Streams Advanced Queueing queue table
  • Add support for parallel queries
  • Re-create a table or cluster to reduce fragmentation
  • Change the organization of a normal table (heap organized) to an index-organized table, or do the reverse.
  • Convert a relational table into a table with object columns, or do the reverse.
  • Convert an object table into a relational table or a table with object columns, or do the reverse.

  但是Online Redefinition恰恰就是不支援對于表上曆史資料的清理( 甚至于資料更新update都是支援的), 對于存有海量資料且沒有分區的大表而言曆史資料的清理是非常頭痛的工作,特别是在7*24 應用不能下線的環境當中, 我們往往無法利用CTAS或INSERT APPEND+NOLOGGING的方式重建表以加速清理工作,而僅能使用最為原始的DELETE DML, 而我們知道DELETE操作是很緩慢的(與之前所提及的方法相比,真實世界中DELETE的效率還會受到INDEX clustering_factor聚集因子等因素的影響而顯得更慢,見<SQL調優:Clustering Factor影響資料删除速度一例>), 且為了避免ORA-01555快照過久的錯誤出現,我們不能簡單地使用一條DELETE SQL來清理資料,而需要使用PL/SQL匿名塊控制并定期commit送出以避免ORA-01555。   實際上我們還是可以通過将非分區表Online Redefinition轉換為以删除條件為Range範圍分區的Partition-ed Table,再直接Truncate Partiton的方法來加速曆史資料的清理, 同時又不影響業務的線上。   例如有如下非分區表一張:  

create table order_history
(
order_id number primary key,
issue_date date ,
location varchar2(200),
amount number,
maclean varchar2(200),
QUANTITY_SOLD number,
PROMO_ID number,
CUST_ID number,
CHANNEL_ID number) tablespace users pctfree 0;

SQL> select count(*) from ORDER_HISTORY;

  COUNT(*)
----------
   4550092

SQL> select count(*) from ORDER_HISTORY where issue_date< sysdate-365;

  COUNT(*)
----------
   3412569

SQL> exec dbms_stats.gather_table_stats('SH','ORDER_HISTORY');

PL/SQL procedure successfully completed.      

  該表上存有450萬條Order記錄 , 其中340萬條是一年之前的曆史記錄需要予以删除。 我們先着手将該非分區表線上沖定義為以issue_date日期為範圍的Range Partition。   以下我們會建立臨時分區表,并完成Online Redefinition:  

create table order_history_int
(
order_id number primary key,
issue_date date ,
location varchar2(200),
amount number,
maclean varchar2(200),
QUANTITY_SOLD number,
PROMO_ID number,
CUST_ID number,
CHANNEL_ID number)
partition by range(issue_date)
( partition p1 values less than (to_date('2009-01-01','YYYY-MM-DD')) tablespace users,
  partition p2 values less than (to_date('2010-01-01','YYYY-MM-DD')) tablespace users,
  partition p3 values less than (to_date('2011-01-01','YYYY-MM-DD')) tablespace users,
  partition p4 values less than (to_date('2012-01-01','YYYY-MM-DD')) tablespace users,
  partition p5 values less than (maxvalue) tablespace users);

set timing on;

SQL> begin
  2    dbms_redefinition.can_redef_table(uname        => 'SH',
  3                                      tname        => 'ORDER_HISTORY',
  4                                      options_flag => DBMS_REDEFINITION.cons_use_pk);
  5  end;
  6  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.10

這裡采用了DBMS_REDEFINITION.cons_use_pk, 即Primary Key的重定義方式
若沒有主鍵和僞主鍵,那麼也可以選擇rowid的方式,當然我們更推薦使用主鍵

SQL> begin
  2    DBMS_REDEFINITION.START_REDEF_TABLE(uname        => 'SH',
  3                                        orig_table   => 'ORDER_HISTORY',
  4                                        int_table    => 'ORDER_HISTORY_INT',
  5                                        options_flag => DBMS_REDEFINITION.cons_use_pk);
  6  end;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:25.10 

SQL> select count(*) from ORDER_HISTORY_INT;

  COUNT(*)
----------
   4550092

Elapsed: 00:00:00.24

SQL> DECLARE
  2    num_errors PLS_INTEGER;
  3  BEGIN
  4    DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS(uname            => 'SH',
  5                                            orig_table       => 'ORDER_HISTORY',
  6                                            int_table        => 'ORDER_HISTORY_INT',
  7                                            copy_indexes     => 0,
  8                                            copy_triggers    => TRUE,
  9                                            copy_constraints => FALSE,
 10                                            copy_privileges  => TRUE,
 11                                            ignore_errors    => FALSE,
 12                                            num_errors       => num_errors,
 13                                            copy_statistics  => TRUE);
 14  END;
 15  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:07.82

SQL> begin
  2    dbms_redefinition.finish_redef_table(uname      => 'SH',
  3                                         orig_table => 'ORDER_HISTORY',
  4                                         int_table  => 'ORDER_HISTORY_INT');
  5  end;
  6  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.87      

  以上僅用了不到2分鐘時間就完成了對ORDER_HISTORY曆史表的線上重定義,緊接着我們隻需要truncate前幾年的曆史分區即可,因為truncate操作是DDL語句且與DELETE的機制大相徑庭,是以速度一般是很快的: 但是請注意若表上有global Index,那麼這樣可能會導緻索引失效,是以建議在Online Redefinition時将全局索引轉換為本地索引,或者使用update global indexes;子句    

SQL>  alter table ORDER_HISTORY truncate partition p1 [update global indexes];

Table truncated.

Elapsed: 00:00:00.21

SQL> alter table ORDER_HISTORY truncate partition p2 [update global indexes];

Table truncated.

Elapsed: 00:00:00.07

SQL> alter table ORDER_HISTORY truncate partition p3 [update global indexes];

Table truncated.

Elapsed: 00:00:00.07

SQL> select count(*) from ORDER_HISTORY;

  COUNT(*)
----------
   1137523

Elapsed: 00:00:00.14

SQL> select count(*) from ORDER_HISTORY where issue_date< sysdate-365;

  COUNT(*)
----------
         0

Elapsed: 00:00:00.02      

    以上我們通過Oracle線上重定義Online Redefinition特性将非分區的曆史資料表線上轉化為以日期為範圍的分區表,利用分區表truncate partition清理資料的便利性來解決傳統以PL/SQL過程遊标擷取rowid循環方式删除曆史資料十分緩慢的問題,同時也保證了業務應用的始終線上。