天天看點

淺析InnoDB Record Header及page overflow(2)

3、案例測試驗證過程

我們以場景18為例做測試。測試表結構見下(橫屏觀看,下同)

[[email protected]]> CREATE TABLE `t1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `c1` varchar(10) DEFAULT NULL,
  `c2` varchar(10) DEFAULT NULL,
  `c3` varchar(10) DEFAULT NULL,
  `c4` varchar(10) DEFAULT NULL,
  `c5` varchar(10) DEFAULT NULL,
  `c6` varchar(10) DEFAULT NULL,
  `c7` varchar(10) DEFAULT NULL,
  `c8` varchar(10) DEFAULT NULL,
  `c9` varchar(10) DEFAULT NULL,
  `c10` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=DYNAMIC;      

插入兩條資料

[[email protected]]>insert into t1 values(1,'a','a','a','a','a','a','a','a','a','a');
[[email protected]]>insert into t1 values(2,'b','b','b','b','b','b','b','b','b','b');      

用innblock工具檢視資料結構:

[[email protected]]# innblock innodb/t2_varchar.ibd 3 16
...
-----Total used rows:4 used rows list(phy):
(1) INFIMUM record offset:99 heapno:0 n_owned 1,delflag:N minflag:0 rectype:2
(2) SUPREMUM record offset:112 heapno:1 n_owned 3,delflag:N minflag:0 rectype:3
(3) normal record offset:137 heapno:2 n_owned 0,delflag:N minflag:0 rectype:0
(4) normal record offset:181 heapno:3 n_owned 0,delflag:N minflag:0 rectype:0      

可以看到一條實體記錄的長度是 181-137=44位元組,分别由以下幾部分組成:

  • record header:17位元組
    • 其中基礎RH=5
    • 共10個varchar列均允許為NULL是以需要額外10bit折算成2位元組
    • 另外每個varchar列最大定義存儲長度10*4=40,沒超過256位元組,每個varchar列需要額外1個位元組,共10位元組
    • 那麼5+2+10=17位元組
  • db_trx_id: 6位元組
  • db_roll_ptr: 7位元組
  • id列: 4位元組
  • c1 ~ c10列:10位元組

我們再用innodb_ruby工具驗證一下:

[[email protected]]# innodb_space -s ibdata1 -T innodb/t1 -p 3 page-dump
...
records:
{:format=>:compact,
 :offset=>137,
 :header=>
  {:next=>181,
   :type=>:conventional,
   :heap_number=>2,
   :n_owned=>0,
   :min_rec=>false,
   :deleted=>false,
   :nulls=>[],
   :lengths=>
    {"c1"=>1,
     "c2"=>1,
     "c3"=>1,
     "c4"=>1,
     "c5"=>1,
     "c6"=>1,
     "c7"=>1,
     "c8"=>1,
     "c9"=>1,
     "c10"=>1},
   :externs=>[],
   :length=>17}, #所有record header總大小是17位元組
 :next=>181,
 :type=>:clustered,
 :key=>[{:name=>"id", :type=>"INT UNSIGNED", :value=>1}],
 :row=>
  [{:name=>"c1", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c2", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c3", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c4", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c5", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c6", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c7", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c8", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c9", :type=>"VARCHAR(40)", :value=>"a"},
   {:name=>"c10", :type=>"VARCHAR(40)", :value=>"a"}],
 :sys=>
  [{:name=>"DB_TRX_ID", :type=>"TRX_ID", :value=>10518},
   {:name=>"DB_ROLL_PTR",
    :type=>"ROLL_PTR",
    :value=>
     {:is_insert=>true,
      :rseg_id=>100,
      :undo_log=>{:page=>491, :offset=>272}}}],
 :length=>27, #row data總大小是27位元組,加上RH的17位元組,總共是44位元組,和推測結果一緻
 :transaction_id=>10518,
 :roll_pointer=>
  {:is_insert=>true, :rseg_id=>100, :undo_log=>{:page=>491, :offset=>272}}}
...      

4、什麼時候發生page overflow

我們大概知道,一條記錄由于有較多的變長列(varchar/text/blob類型),當長度約為page size的一半時,就會發生overflow,會把最長的那個列存儲在獨立的page中,聚集索引中用20位元組的指針指向那個page(dynamic row format時是這麼做的,不同row format處理方式也不同)。文檔中是這麼說的

Variable-length columns are an exception to the rule that column values are stored in B-tree index nodes. Variable-length columns that are too long to fit on a B-tree page are stored on separately allocated disk pages called overflow pages. Such columns are referred to as off-page columns. The values of off-page columns are stored in singly-linked lists of overflow pages, with each such column having its own list of one or more overflow pages. Depending on column length, all or a prefix of variable-length column values are stored in the B-tree to avoid wasting storage and having to read a separate page.

When a table is created with ROW_FORMAT=DYNAMIC, InnoDB can store long variable-length column values (for VARCHAR, VARBINARY, and BLOB and TEXT types) fully off-page, with the clustered index record containing only a 20-byte pointer to the overflow page. Fixed-length fields greater than or equal to 768 bytes are encoded as variable-length fields. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.

Whether columns are stored off-page depends on the page size and the total size of the row. When a row is too long, the longest columns are chosen for off-page storage until the clustered index record fits on the B-tree page. TEXT and BLOB columns that are less than or equal to 40 bytes are stored in line.

我們來測試下,一條記錄長度到底是多少時,會發生overflow。測試表結構

[[email protected]]> CREATE TABLE `t1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;      

我們先進行反推,計算每條記錄最長能存儲多少位元組:

  • 預設的page size是16KB,也就是16384位元組
  • FIL Header消耗38位元組
  • Index Header消耗36位元組
  • File Segment Header消耗20位元組
  • Infimum & Supremum 兩條虛拟記錄共消耗26位元組
  • FIL Trailer消耗8位元組
  • 由于此時page中最多隻有兩條實體記錄,是以Page Directory消耗4位元組
  • 每條記錄還需要額外消耗DB_TRX_ID(6B)、DB_ROLL_PTR(7B)共13位元組
  • 上述測試表的record header需要消耗7位元組
  • 最後可算得每條記錄中,text列最多不可超過(16384-38-36-20-26-8-4)/2-13-7-4=8102位元組
  • 經過實測,text列最多可存儲8101位元組,一旦超過該值,就會發生overflow
  • 如果不預設表中各個列資料類型的話,當一條記錄超過8125位元組(含可變長度的record header,以及DB_TRX_ID、DB_ROLL_PTR的13位元組在内)時,就會發生overflow

插入兩條測試資料

[[email protected]]> insert into t1 select 1,repeat('a',8102);
[[email protected]]> insert into t1 select 2,repeat('a',8101);      

用innblock工具解析

[[email protected]]# innblock innodb/t1.ibd 3 16
...
-----Total used rows:4 used rows list(phy):
(1) INFIMUM record offset:99 heapno:0 n_owned 1,delflag:N minflag:0 rectype:2
(2) SUPREMUM record offset:112 heapno:1 n_owned 3,delflag:N minflag:0 rectype:3
(3) normal record offset:127 heapno:2 n_owned 0,delflag:N minflag:0 rectype:0
(4) normal record offset:171 heapno:3 n_owned 0,delflag:N minflag:0 rectype:0
...      

注意到第一條記錄的長度是 171-127=44位元組,可見的确是做overflow處理了。再用innodb_ruby工具解析整個page

[[email protected]]# innodb_space -s ibdata1 -T innodb/t1 -p 3 page-dump
...
records:
{:format=>:compact,
 :offset=>127,
 :header=>
  {:next=>171,
...
 :key=>[{:name=>"id", :type=>"INT", :value=>1}],
 :row=>
  [{:name=>"c1",
    :type=>"BLOB",
    :value=>"",
    :extern=>{:space_id=>214, :page_number=>4, :offset=>38, :length=>8102}}],
...

{:format=>:compact,
 :offset=>171,
 :header=>
  {:next=>112,
   :type=>:conventional,
...
 :key=>[{:name=>"id", :type=>"INT", :value=>2}],
 :row=>
  [{:name=>"c1",
    :type=>"BLOB",
    :value=>
     "aaaaaaa...a"}],
 :sys=>
  [{:name=>"DB_TRX_ID", :type=>"TRX_ID", :value=>10619},
   {:name=>"DB_ROLL_PTR",
    :type=>"ROLL_PTR",
    :value=>
     {:is_insert=>true, :rseg_id=>36, :undo_log=>{:page=>466, :offset=>272}}}],
 :length=>8118,
 :transaction_id=>10619,
 :roll_pointer=>
  {:is_insert=>true, :rseg_id=>36, :undo_log=>{:page=>466, :offset=>272}}}
...          

從page dump的結果能看到,第一條記錄溢出存儲在另一個page(pageno=4),溢出的位元組數是8102,也就是全部放在獨立的page裡存儲了,聚集索引中隻保留了20位元組的指針。

我們繼續測試當表裡有兩個text列的情況

[[email protected]]> CREATE TABLE `t1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` text NOT NULL,
  `c2` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

[[email protected]]> insert into t1 select 1,repeat('a',100),repeat('a',8001);
[[email protected]]> insert into t1 select 2,repeat('a',100),repeat('a',8000);      
[[email protected]]# innblock innodb/t1.ibd 3 16
...
-----Total used rows:4 used rows list(phy):
(1) INFIMUM record offset:99 heapno:0 n_owned 1,delflag:N minflag:0 rectype:2
(2) SUPREMUM record offset:112 heapno:1 n_owned 3,delflag:N minflag:0 rectype:3
(3) normal record offset:128 heapno:2 n_owned 0,delflag:N minflag:0 rectype:0
(4) normal record offset:273 heapno:3 n_owned 0,delflag:N minflag:0 rectype:0
...      

注意到第一條記錄的長度是 273-128=145位元組,可見的确是做overflow處理了。再用innodb_ruby工具解析整個page

[[email protected]]# innodb_space -s ibdata1 -T innodb/t1 -p 3 page-dump
...
{:format=>:compact,
 :offset=>128,
 :header=>
  {:next=>273,
...
:key=>[{:name=>"id", :type=>"INT", :value=>1}],
 :row=>
  [{:name=>"c1",
    :type=>"BLOB",
    :value=>
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"},
   {:name=>"c2",
    :type=>"BLOB",
    :value=>"",
    :extern=>{:space_id=>215, :page_number=>4, :offset=>38, :length=>8001}}],
...
{:format=>:compact,
 :offset=>273,
 :header=>
  {:next=>112,
...
 :key=>[{:name=>"id", :type=>"INT", :value=>2}],
 :row=>
  [{:name=>"c1",
    :type=>"BLOB",
    :value=>
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"},
   {:name=>"c2",
    :type=>"BLOB",
    :value=>
     "a..."}],
 :sys=>
  [{:name=>"DB_TRX_ID", :type=>"TRX_ID", :value=>10686},
   {:name=>"DB_ROLL_PTR",
    :type=>"ROLL_PTR",
    :value=>
     {:is_insert=>true, :rseg_id=>78, :undo_log=>{:page=>478, :offset=>272}}}],
 :length=>8117,
 :transaction_id=>10686,
 :roll_pointer=>
  {:is_insert=>true, :rseg_id=>78, :undo_log=>{:page=>478, :offset=>272}}}            

從page dump的結果能看到,第一條記錄溢出存儲在另一個page(pageno=4),溢出的位元組數是8001,也是全部放在獨立的page裡存儲了,聚集索引中隻保留了20位元組的指針。

好吧,測試案例就介紹到這裡,更多的場景請自行測試。

P.S,如果想偷懶不自己測試,也可以看看我的 InnoDB Record Header消耗測試過程實錄(文末提供連結)。

P.P.S,我不是源碼級MySQL核心開發者,水準有限,文中難免有誤之處,還請多指教。

Enjoy MySQL :)