MySQL 8.0 - New Features - The Hidden Parameters of the InnoDB Log System

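When InnoDB's lock-free log system was designed, a number of parameters were hidden behind a macro in addition to the ones that are normally visible. If you build from source with the cmake option -DENABLE_EXPERIMENT_SYSVARS=1, these parameters become visible. This article briefly goes through what each of these hidden parameters means.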

innodb_log_write_events

innodb_log_flush_events

The two are similar: they set the number of events used to wake up threads waiting for a log write/flush. Both default to 2048.

For example, if the position you are waiting for is lsnA, the slot is computed as:

slot = ((lsnA - 1) / OS_FILE_LOG_BLOCK_SIZE) & (innodb_log_write/flush_events - 1)

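This means that if the end LSNs of several transactions' commit log records fall in the same block, they map to the same slot and may contend for the same event.

When the LSNs are not in the same block, increasing the parameter reduces contention, although spurious wake-ups can still occur.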
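The slot formula above can be tried out in isolation. The following is a minimal sketch, not InnoDB's code: the function name event_slot_for_lsn and the constant kLogBlockSize are made up for illustration, the block size is assumed to be 512 bytes, and the event count is assumed to be a power of two so that the bitwise AND acts as a modulo.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: a redo log block is 512 bytes.
constexpr uint64_t kLogBlockSize = 512;

// Hypothetical helper: map an LSN to the index of the event a waiter
// would sleep on. n_events must be a power of two, otherwise
// "& (n_events - 1)" is not equivalent to a modulo.
inline uint64_t event_slot_for_lsn(uint64_t lsn, uint64_t n_events) {
  assert(lsn > 0);
  assert(n_events != 0 && (n_events & (n_events - 1)) == 0);
  return ((lsn - 1) / kLogBlockSize) & (n_events - 1);
}
```

With 2048 events, LSNs 1 through 512 all map to slot 0, LSN 513 maps to slot 1, and LSN 1048577 (one byte past 2048 blocks) wraps around to slot 0 again, which is where waiters for unrelated LSNs can end up sharing an event.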

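Wake-ups are normally performed asynchronously by the background threads log_write_notifier or log_flush_notifier. However, if the log write/flush advanced by less than one block, the log_writer/log_flusher thread performs the wake-up itself.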

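innodb_log_recent_written_size, default 1MB

This is the size of the recent_written link_buf. It effectively bounds how much transaction log can be copied into the log buffer concurrently. New log records are added at the front, and the log writer advances the back as it writes the log out. If writes are slow, the link_buf is likely to fill up and user threads have to spin-wait. On systems with slow I/O, this parameter can be increased somewhat.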

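innodb_log_recent_closed_size, default 2MB

This is the size of the recent_closed link_buf, which likewise bounds the concurrency with which dirty pages can be inserted into the flush list. If dirty pages are inserted slowly, or the link_buf is not merged and advanced in time, threads spin-wait.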

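A brief note on link_buf: it is essentially an array, but it is used in a lock-free manner to track how far the LSN has advanced. For example, once a thread has obtained a start and end LSN, it records a link with something like buf[start_lsn] = end_lsn. Because LSNs are contiguous, no holes can remain in the end, so as the structure evolves the tail can be advanced over contiguous LSNs while new values are inserted at the head.
If a newly inserted value would wrap around past the tail, the buffer is full and the thread has to spin-wait.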
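To make the link_buf description above concrete, here is a minimal sketch of the idea. It is not the InnoDB Link_buf class: the class name LinkBufSketch and its methods are hypothetical, each slot stores the end LSN as in the buf[start_lsn] = end_lsn description, a slot value of 0 means "empty", and only a single thread is assumed to call advance_tail().

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified link buffer: many threads add links,
// one thread advances the tail.
class LinkBufSketch {
 public:
  explicit LinkBufSketch(size_t capacity) : links_(capacity), tail_(0) {
    // Power-of-two capacity lets us index with a bit mask.
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    for (auto &slot : links_) slot.store(0, std::memory_order_relaxed);
  }

  // A link starting at `from` fits only if it is less than one
  // buffer length ahead of the tail; otherwise the caller must wait.
  bool has_space(uint64_t from) const {
    return from < tail_.load(std::memory_order_acquire) + links_.size();
  }

  // Record that the range [from, to) is complete.
  void add_link(uint64_t from, uint64_t to) {
    assert(to > from);
    assert(has_space(from));
    links_[from & (links_.size() - 1)].store(to, std::memory_order_release);
  }

  // Follow contiguous links from the tail and stop at the first gap.
  // Returns the new tail. Single consumer only.
  uint64_t advance_tail() {
    uint64_t tail = tail_.load(std::memory_order_acquire);
    for (;;) {
      auto &slot = links_[tail & (links_.size() - 1)];
      const uint64_t next = slot.load(std::memory_order_acquire);
      if (next == 0) break;                      // gap: nothing linked here yet
      slot.store(0, std::memory_order_relaxed);  // free the slot for reuse
      tail = next;
    }
    tail_.store(tail, std::memory_order_release);
    return tail;
  }

 private:
  std::vector<std::atomic<uint64_t>> links_;
  std::atomic<uint64_t> tail_;
};
```

For instance, with a capacity of 8, adding the link 3 -> 5 first leaves the tail at 0 because the range starting at 0 is still missing. Once 0 -> 3 is added, a single advance_tail() call moves the tail to 5, and positions up to 12 then fit in the buffer while position 13 does not.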

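innodb_log_wait_for_write_spin_delay

innodb_log_wait_for_write_timeout

Starting with 8.0, user threads no longer write redo themselves; they wait for background threads to write it. These two variables control the spin and the condition-wait timeout: if the log has not advanced to the desired LSN after spinning for a while, the thread enters a condition wait.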

Two more variables:

innodb_log_wait_for_flush_spin_delay

innodb_log_wait_for_flush_timeout

have a similar meaning, except that the thread is waiting for the log to be flushed up to a given LSN.
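The "spin first, then fall back to a timed condition wait" pattern described above can be sketched as follows. This is only an illustration of the pattern using the standard library, not InnoDB's event implementation: the names WaitStats and spin_then_wait are hypothetical, and the stop condition is passed in as a callable (for example, "the written LSN has reached my target").

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical bookkeeping: how the wait was spent.
struct WaitStats {
  uint64_t spins;  // failed checks during the spin phase
  uint64_t waits;  // timed condition waits that returned
};

// Check stop_condition up to max_spins times without sleeping, then
// block on the condition variable, re-checking after every wake-up
// or timeout until the condition holds.
template <typename Condition>
WaitStats spin_then_wait(std::mutex &m, std::condition_variable &cv,
                         uint64_t max_spins,
                         std::chrono::microseconds timeout,
                         Condition stop_condition) {
  assert(timeout.count() > 0);
  WaitStats stats{0, 0};

  // Phase 1: spin, giving other threads a chance to run.
  for (uint64_t i = 0; i < max_spins; ++i) {
    if (stop_condition()) return stats;
    ++stats.spins;
    std::this_thread::yield();
  }

  // Phase 2: sleep with a timeout so a missed notification cannot
  // block the waiter forever.
  std::unique_lock<std::mutex> lock(m);
  while (!stop_condition()) {
    cv.wait_for(lock, timeout);
    ++stats.waits;
  }
  return stats;
}
```

In this sketch, max_spins plays the role of the *_spin_delay setting and timeout the role of the *_timeout setting. Passing max_spins = 0 skips the spin phase entirely, which corresponds to the "do not spin" outcome of the spin calculation discussed later in the article.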

Note that in the actual calculation, the maximum number of spins also takes CPU usage into account, along with two other parameters:

innodb_log_spin_cpu_abs_lwm

innodb_log_spin_cpu_pct_hwm

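When waiting for a flush, the parameter innodb_log_wait_for_flush_spin_hwm also applies. It sets an upper bound on the flush wait time: if the average time spent waiting for a flush exceeds this bound, there is no point in spinning, and the thread goes straight into a condition wait.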

The spin count is computed in the function

log_max_spins_when_waiting_in_user_thread

The function's argument is the value of the setting innodb_log_wait_for_write_spin_delay or innodb_log_wait_for_flush_spin_delay:

static inline uint64_t log_max_spins_when_waiting_in_user_thread(
    uint64_t min_non_zero_value) {
  uint64_t max_spins;

  /* Get current cpu usage. */
  const double cpu = srv_cpu_usage.utime_pct;

  /* Get high-watermark - when cpu usage is higher, don't spin! */
  const uint32_t hwm = srv_log_spin_cpu_pct_hwm;

  if (srv_cpu_usage.utime_abs < srv_log_spin_cpu_abs_lwm || cpu >= hwm) {
    /* Don't spin because either cpu usage is too high or it's
    almost idle so no reason to bother. */
    max_spins = 0;

  } else if (cpu >= hwm / 2) {
    /* When cpu usage is more than 50% of the hwm, use the minimum allowed
    number of spin rounds, not to increase cpu usage too much (risky). */
    max_spins = min_non_zero_value;

  } else {
    /* When cpu usage is less than 50% of the hwm, choose maximum spin rounds
    in range [minimum, 10*minimum]. Smaller usage of cpu is, more spin rounds
    might be used. */
    const double r = 1.0 * (hwm / 2 - cpu) / (hwm / 2);

    max_spins =
        static_cast<uint64_t>(min_non_zero_value + r * min_non_zero_value * 9);
  }

  return (max_spins);
}
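To see what this function returns for concrete inputs, the calculation can be copied into a stand-alone form in which the server globals become arguments. This is a sketch for experimentation only: the names CpuSample and max_spins_sketch are hypothetical, and the values used in the walkthrough below (abs_lwm = 80, pct_hwm = 50, a minimum of 1000 spins) are chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for srv_cpu_usage.
struct CpuSample {
  double utime_abs;  // absolute CPU usage
  double utime_pct;  // CPU usage as a percentage
};

// Same branches as the function above, with abs_lwm and pct_hwm passed
// in instead of read from srv_log_spin_cpu_abs_lwm / _pct_hwm.
inline uint64_t max_spins_sketch(uint64_t min_non_zero_value, CpuSample cpu,
                                 uint32_t abs_lwm, uint32_t pct_hwm) {
  assert(min_non_zero_value > 0);

  if (cpu.utime_abs < abs_lwm || cpu.utime_pct >= pct_hwm) {
    // Nearly idle or too busy: do not spin at all.
    return 0;
  }
  if (cpu.utime_pct >= pct_hwm / 2) {
    // Above half of the high-water mark: spin the minimum amount.
    return min_non_zero_value;
  }
  // Below half of the high-water mark: scale linearly between
  // minimum (at hwm / 2) and 10 * minimum (at 0% usage).
  const double r = 1.0 * (pct_hwm / 2 - cpu.utime_pct) / (pct_hwm / 2);
  return static_cast<uint64_t>(min_non_zero_value +
                               r * min_non_zero_value * 9);
}
```

With abs_lwm = 80, pct_hwm = 50 and a minimum of 1000: an absolute usage of 40 gives 0 spins, 60% usage gives 0, 30% gives 1000, 12.5% gives 5500 (r = 0.5), and 0% gives 10000. Note that pct_hwm / 2 is an integer division, as in the original.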

The following parameters are the spin and condition-wait timeout values that the background threads use while waiting for work.

log_writer thread:

innodb_log_writer_spin_delay

innodb_log_writer_timeout

log_flusher thread:

innodb_log_flusher_spin_delay

innodb_log_flusher_timeout

log_write_notifier thread:

innodb_log_write_notifier_spin_delay

innodb_log_write_notifier_timeout

log_flush_notifier thread:

innodb_log_flush_notifier_spin_delay

innodb_log_flush_notifier_timeout

log_closer thread (a dedicated thread that advances the recent_closed link_buf):

innodb_log_closer_spin_delay

innodb_log_closer_timeout

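innodb_log_write_max_size

The maximum number of bytes allowed for a single write operation; the default is 4KB. It is applied when advancing the recent_written link_buf. Personally I think this limit is too small, and the parameter can be raised as appropriate. (However, in 8.0 the maximum write size is also limited by innodb_log_write_ahead_size, so the two need to be considered together.)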

innodb_log_checkpoint_every

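Default 1000 milliseconds (1 second): the log_checkpointer thread attempts a checkpoint at least this often. Whether a checkpoint is actually performed also depends on other factors; see the function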

log_should_checkpoint

a) more than 1s elapsed since last checkpoint
b) checkpoint age is greater than max_checkpoint_age_async
c) it was requested to have greater checkpoint_lsn,
             and oldest_lsn allows to satisfy the request
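The three conditions listed above can be combined into a single predicate. The following is a sketch based only on that list, not a copy of log_should_checkpoint: the struct CheckpointState, its field names, and the function name are hypothetical, the comparisons follow the wording of the list ("more than", "greater than"), and "oldest_lsn allows to satisfy the request" is interpreted as the requested LSN not exceeding oldest_lsn.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of the values the decision depends on.
struct CheckpointState {
  uint64_t ms_since_last_checkpoint;
  uint64_t checkpoint_every_ms;       // innodb_log_checkpoint_every
  uint64_t checkpoint_age;
  uint64_t max_checkpoint_age_async;
  uint64_t requested_checkpoint_lsn;  // 0 when nothing was requested
  uint64_t last_checkpoint_lsn;
  uint64_t oldest_lsn;
};

inline bool should_checkpoint_sketch(const CheckpointState &s) {
  // Sanity check for this sketch: the oldest LSN still needed cannot
  // be behind the last checkpoint.
  assert(s.oldest_lsn >= s.last_checkpoint_lsn);

  // a) enough time has elapsed since the last checkpoint
  const bool periodic = s.ms_since_last_checkpoint > s.checkpoint_every_ms;
  // b) the checkpoint age exceeds the async threshold
  const bool too_old = s.checkpoint_age > s.max_checkpoint_age_async;
  // c) a newer checkpoint was requested and oldest_lsn already covers it
  const bool requested =
      s.requested_checkpoint_lsn > s.last_checkpoint_lsn &&
      s.requested_checkpoint_lsn <= s.oldest_lsn;

  return periodic || too_old || requested;
}
```

Any one of the three conditions is enough to trigger a checkpoint. A request for an LSN beyond oldest_lsn does not trigger one on its own in this sketch, since the checkpoint could not reach the requested LSN yet.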

References:

MySQL 8.0.16 source code
