寫有效的曆史資料遷移sql

對于高并發oltp系統，生産庫可能隻需保留目前幾個月的資料，之前的資料要全部遷移到曆史庫中。那麼，如何處理這樣的需求，如何寫合适的曆史遷移程式呢？

1.正常寫法

begin

--遷移資料

insert into tb_users

select *

from tb_users_dbc a

where gmt_modified <= trunc(sysdate) - v_day and rownum<5000000;

--删除原表資料

delete from tb_users_dbc

where gmt_modified <= trunc(sysdate) - v_day and rownum<5000000;

commit;

end;

2.使用for循環

declare

v_counts number := 0;

begin

--中間表

insert into tmp_tb_users

select *

from tb_users_dbc

where gmt_modified <= trunc(sysdate) - v_day and rownum<5000000;

commit;

for i in (select id from tmp_tb_users) loop

--資料遷移

insert into tb_users

select * from tb_users_dbc where id = i.id;

delete from tb_users_dbc where id = i.id;

v_counts := v_counts + 1;

--分批送出

if mod(v_counts, 100) = 0 then

commit;

end if;

end loop;

end;

3.使用bulk collect

declare

type rec_rids is table of varchar2(32);

v_rids rec_rids;

cursor cur_aids is

select id

from tb_users_dbc

where gmt_modified <= trunc(sysdate) - v_day and rownum<5000000;

begin

open cur_aids;

loop

fetch cur_aids BULK COLLECT INTO v_rids LIMIT 1000;

for i in 1 .. v_rids.count loop

--資料遷移

insert into tb_users

select * from tb_users_dbc where id = v_rids(i);

delete from tb_users_dbc where id = v_rids(i);

end loop;

commit;

EXIT WHEN v_rids.COUNT = 0;

end loop;

close cur_aids;

end;

很少考慮正常寫法，資料遷移一般會選擇采用2，3種方法。

相比for遷移方式，采用bulk collect避免了上下文切換，在大批量資料遷移中還是有很大優勢的，推薦采用bulk collect資料遷移方案。

Asktom上有很多關于曆史遷移方案的文章，Tom的答案一直都是正常寫法，也就是上面的第一種寫法，可是正常寫法消耗大量的資源，出現異常時整個事務都得復原。是以很多人都認為Tom的方案不可行，認為Tom并沒有接觸過大資料庫等。

4.借助Tom的思想，把遷移的資料拆分成n個小表，對n個小表進行遷移

declare

pagecount number;

sumcount number;

loopcount number;

begin

--取要遷移的資料

execute immediate 'truncate table tmp_tb_users';

insert into tmp_tb_users

select id

from tb_users_dbc

where gmt_modified <= trunc(sysdate) - v_day and rownum < 5000000;

--計算遷移總數

select count(*) into sumcount from tmp_tb_users;

--設定每次遷移條數

pagecount = 2000;

--得到循環次數

loopcount := sumcount / pagecount + 1;

for i in 1 .. loopcount loop

begcount := (i - 1) * pagecount + 1;

endcount := i * pagecount;

--建構中間小表

execute immediate 'truncate table mid_tb_users';

insert into mid_tb_users

select id, b.rn

from tmp_tb_users b where b.rn >= begcount and b.rn <= endcount;

--小表和原表進行關聯，遷移資料

insert into tb_users

select f.*

from mid_tb_users t, tb_users_dbc f

where t.biz_order_id = f.biz_order_id

and t.rn >= begcount and t.rn < = endcount;

--删除原表資料

delete from tb_users_dbc where id in (select id from mid_tb_users);

commit;

end loop;

end;

1.優于正常寫法，可以分批多次進行送出，加入異常處理可以避免全部資料復原。

2.優于for遷移，借助中間小表一次遷移多條記錄，大大降低了insert，delete的執行次數。

3.主要是提供給大家一種思路。

結論：

其實寫法好沒有好壞之分，關鍵在于怎麼用，就大批量遷移資料來說，我覺得3，4都是比較可行的方案。歡迎大家拍磚讨論，也歡迎貢獻更好的資料遷移辦法。

注：上面腳本是僞代碼

<!--

--EOF--

來自 “ ITPUB部落格 ” ，連結：http://blog.itpub.net/10742223/viewspace-235564/，如需轉載，請注明出處，否則将追究法律責任。

轉載于:http://blog.itpub.net/10742223/viewspace-235564/

寫有效的曆史資料遷移sql

繼續閱讀

Testlink安裝部署之XAMPP

set define off關閉替代變量功能

報錯：'mysql' 不是内部或外部指令，也不是可運作的程式或批處理檔案。

Linxu常用指令技巧彙總

ERROR 1 (HY000): Can't create/write to file '/tmp/#sql_4188_1.MYI' (Errcode: 28)

艱難安裝LDAP,SSL認證

《Linux指令行與Shell腳本程式設計大全第2版.布盧姆》pdf

MySQL的4種隔離級别？出現問題

XX系統實施過程問題總結

無元件上傳圖檔到資料庫中，最完整解決方案

【MySQL資料庫】資料庫索引事務1.索引2.事務

neo4j之cypher使用文檔

NOSQL安全攻擊

mybatis_入門程式Mybatis入門

登入plsql 報錯 the account is locked --使用者被鎖

SequoiaDB巨杉資料庫C++驅動概述