
Offline Deployment Series, Part 2: TiDB Cluster Upgrade (5.3.0 -> 5.4.2) & Scaling TiDB Server, PD, TiKV, and TiFlash In/Out

Author: OnTheRoad

The deployment roadmap of this document series is:

  1. Offline deployment of TiDB v5.3.0 (TiDB*3, PD*3, TiKV*3);
  2. Source-code deployment of HAProxy v2.5.0 and user management;
  3. Offline upgrade from TiDB v5.3.0 to TiDB v5.4.2;
  4. Scaling TiDB Server, PD, TiKV, and TiFlash in/out;
  5. Deployment of TiSpark (TiSpark*3);
  6. Offline upgrade from TiDB v5.4.2 to TiDB v6.1.

3. TiDB Cluster Upgrade

3.1. Upgrading to 5.4.x

The official upgrade documentation is available at: ​​https://docs.pingcap.com/zh/tidb/v5.4/upgrade-tidb-using-tiup​​

3.1.1. Key Features of 5.4.x

Release date: February 15, 2022. The key features of 5.4.0 are:

  1. Support for the GBK character set
  2. Support for the Index Merge data access method, which merges the condition-filtering results of indexes on multiple columns
  3. Support for bounded stale reads of historical data through a session variable
  4. Support for persisting statistics collection configuration
  5. Support for using Raft Engine as TiKV's log storage engine (experimental)
  6. Reduced impact of backups on the cluster
  7. Support for Azure Blob Storage as a backup target
  8. Continued improvements to the stability and performance of the TiFlash columnar storage engine and the MPP compute engine
  9. A switch for TiDB Lightning that controls whether importing into tables that already contain data is allowed
  10. Optimized continuous profiling (experimental)
  11. TiSpark support for user authentication and authorization

3.1.2. Compatibility

Variable name | Change type | Description
tidb_enable_column_tracking | Newly added | Controls whether TiDB collects PREDICATE COLUMNS. The default value is OFF.
tidb_enable_paging | Newly added | Controls whether the IndexLookUp operator uses paging to send Coprocessor requests. The default value is OFF. For read requests that use IndexLookUp and Limit, where the Limit cannot be pushed down to IndexScan, read latency can be high and the TiKV Unified read pool CPU usage can spike. Because the Limit operator only needs a small portion of the data, enabling tidb_enable_paging reduces the amount of data processed, which lowers latency and resource consumption.
tidb_enable_top_sql | Newly added | Controls whether the Top SQL feature is enabled. The default value is OFF.
tidb_persist_analyze_options | Newly added | Controls whether the ANALYZE configuration persistence feature is enabled. The default value is ON.
tidb_read_staleness | Newly added | Sets the range of historical data that the current session is allowed to read. The default value is 0.
tidb_regard_null_as_point | Newly added | Controls whether the optimizer can use equality conditions containing null as prefix conditions for index access.
tidb_stats_load_sync_wait | Newly added | Controls whether the synchronous loading mode for statistics is enabled (the default value 0 means disabled, i.e. asynchronous loading), and, when it is enabled, how long SQL execution waits for the complete statistics to be loaded synchronously before timing out.
tidb_stats_load_pseudo_timeout | Newly added | Controls whether, after synchronous statistics loading times out, SQL execution fails (OFF) or falls back to pseudo statistics (ON). The default value is OFF.
tidb_backoff_lock_fast | Modified | The default value is changed from 100 to 10.
tidb_enable_index_merge | Modified | The default value is changed from OFF to ON. If the cluster is upgraded from a version earlier than v4.0.0 to v5.4.0 or later, the variable defaults to OFF. If the cluster is upgraded from v4.0.0 or later to v5.4.0 or later, the variable keeps its pre-upgrade setting. For newly created clusters of v5.4.0 or later, the variable defaults to ON.
tidb_store_limit | Modified | Before v5.4.0 this could be set at both the instance level and the cluster level; now only the cluster-level setting is supported.
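After the upgrade, the current values of these variables can be inspected from any TiDB node. A minimal sketch, reusing the connection details that appear later in this document:

# Inspect a few of the new/changed variables listed above.
~]$ mysql -uroot -proot -h 192.168.3.221 -P 4000 \
      -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('tidb_enable_paging','tidb_enable_index_merge','tidb_backoff_lock_fast');"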

3.2. Pre-upgrade Preparation

3.2.1. Updating the TiUP Offline Mirror

Following section 1.5.1 (Deploying the TiUP Component), deploy the new TiUP offline mirror and upload it to the control machine. After local_install.sh is executed, TiUP runs tiup mirror set tidb-community-server-$version-linux-amd64 to point at the new offline mirror source.

The offline mirror package can be downloaded from ​​https://pingcap.com/zh/product-community​​

~]$ id
uid=1000(tidb) gid=1000(tidb) groups=1000(tidb)

~]$ tar -xzvf tidb-community-server-v5.4.2-linux-amd64.tar.gz
~]$ sh tidb-community-server-v5.4.2-linux-amd64/local_install.sh
~]$ source /home/tidb/.bash_profile

~]$ tiup update cluster
Updated successfully!      

At this point the offline mirror has been updated successfully. If TiUP reports errors after the overwrite, try rm -rf ~/.tiup/manifests/* and run it again.

3.2.2. Modifying Conflicting Configuration Items

Load the TiDB cluster configuration with tiup cluster edit-config <cluster-name> and modify any conflicting configuration items. If the original cluster never changed the default configuration parameters, this step can be skipped.

~]$ tiup cluster edit-config kruidb-cluster      
Note that the following TiKV parameters were deprecated in TiDB v5.0. If any of them were configured in the original cluster, delete them in edit-config mode (see the sketch after this list):
  1. pessimistic-txn.enabled
  2. server.request-batch-enable-cross-command
  3. server.request-batch-wait-duration
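For orientation only, a hypothetical excerpt of what tiup cluster edit-config might show if these parameters had been set (the values below are illustrative, not taken from this cluster); the whole lines would be deleted before upgrading:

~]$ tiup cluster edit-config kruidb-cluster
# server_configs:
#   tikv:
#     pessimistic-txn.enabled: true                     # deprecated in v5.0, delete
#     server.request-batch-enable-cross-command: true   # deprecated in v5.0, delete
#     server.request-batch-wait-duration: "1ms"         # deprecated in v5.0, delete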

3.2.3. Cluster Health Check

Before the upgrade, run tiup cluster check <cluster-name> --cluster to check the current health status of the cluster's regions.

~]$ tiup cluster check kruidb-cluster --cluster

...
192.168.3.225  cpu-governor  Warn    Unable to determine current CPU frequency governor policy
192.168.3.225  memory        Pass    memory size is 4096MB
Checking region status of the cluster kruidb-cluster...
All regions are healthy.      

If the result is "All regions are healthy", all regions in the current cluster are healthy and the upgrade can proceed.

If the result is "Regions are not fully healthy: m miss-peer, n pending-peer" together with the hint "Please fix unhealthy regions before other operations.", some regions in the cluster are in an abnormal state and those issues should be resolved first.
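If unhealthy regions are reported, pd-ctl can help locate them. A hedged sketch, using this cluster's PD address and a ctl version matching what the offline mirror provides:

# List regions that are missing peers or still have pending peers.
~]$ tiup ctl:v5.3.0 pd -u http://192.168.3.221:2379 region check miss-peer
~]$ tiup ctl:v5.3.0 pd -u http://192.168.3.221:2379 region check pending-peer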

3.3. 更新叢集

TiUP Cluster 包括不停機更新與停機更新兩種方式。

預設為不停機更新,即更新過程中叢集仍然可以對外提供服務。更新時會對各 TiKV 節點逐個遷移 Leader 後再更新和重新開機,是以對于大規模叢集需要較長時間才能完成整個更新操作。

停機更新則避免了排程 Leader 的過程,若業務可停機,則可以使用停機更新的方式快速進行更新操作。

3.3.1. 停機更新

# 1. Stop the TiDB cluster
~]$ tiup cluster stop kruidb-cluster

# 2. Upgrade the TiDB cluster
~]$ tiup cluster upgrade kruidb-cluster v5.4.2 --offline

# 3. Start the TiDB cluster
~]$ tiup cluster start kruidb-cluster

3.3.2. Online (Rolling) Upgrade

# Rolling upgrade of the TiDB cluster
~]$ tiup cluster upgrade kruidb-cluster v5.4.2

tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster upgrade kruidb-cluster v5.4.2
This operation will upgrade tidb v5.3.0 cluster kruidb-cluster to v5.4.2.
Do you want to continue? [y/N]:(default=N)y

......
Upgrading component pd
        Restarting instance 192.168.3.221:2379
        Restart instance 192.168.3.221:2379 success
        Restarting instance 192.168.3.222:2379
        Restart instance 192.168.3.222:2379 success
        Restarting instance 192.168.3.223:2379
        Restart instance 192.168.3.223:2379 success
Upgrading component tikv
        Evicting 4 leaders from store 192.168.3.224:20160...
          Still waitting for 4 store leaders to transfer...
          Still waitting for 4 store leaders to transfer...         
          ......
        Restarting instance 192.168.3.224:20160   
Upgrading component tidb
        Restarting instance 192.168.3.221:4000
        ......
        
Starting component blackbox_exporter        
        Start 192.168.3.221 success
        ......
Upgraded cluster `kruidb-cluster` successfully                

While TiKV is being upgraded, the leaders on each TiKV instance are evicted one instance at a time before that instance is stopped. The default timeout is 5 minutes (300 seconds); once it elapses, the instance is stopped regardless. The timeout can be raised with --transfer-timeout, for example --transfer-timeout 3600, in seconds.
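For example, a rolling upgrade that allows up to one hour of leader transfer per TiKV node would look like the following (a sketch reusing the cluster name and target version from above):

~]$ tiup cluster upgrade kruidb-cluster v5.4.2 --transfer-timeout 3600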

Note: to upgrade TiFlash from a version earlier than 5.3 to 5.3 or later, TiFlash must be upgraded with downtime. The steps are as follows:

# 1. Stop the TiFlash instances
~]$ tiup cluster stop kruidb-cluster -R tiflash
# 2. Upgrade the TiDB cluster with --offline, i.e. without restarting it
~]$ tiup cluster upgrade kruidb-cluster v5.4.2 --offline
# 3. Reload the cluster; TiFlash will start normally as well
~]$ tiup cluster reload kruidb-cluster

3.4. Verifying the Upgrade

~]$ tiup cluster display kruidb-cluster

tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster
Cluster type:       tidb
Cluster name:       kruidb-cluster
Cluster version:    v5.4.2
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.3.222:2379/dashboard
Grafana URL:        http://192.168.3.221:3000
......      

3.5. Upgrade FAQ

3.5.1. Resuming an Interrupted Upgrade

If the upgrade reports an error and is interrupted, fix the problem and rerun the tiup cluster upgrade command to continue.

To avoid restarting the nodes that have already been upgraded, proceed as follows.

  1. Find the ID of the failed operation, noted as <Audit ID>
~]$ tiup cluster audit

tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster audit
ID           Time                       Command
--           ----                       -------
fWDnXxZpQ5G  2022-07-25T17:02:32+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster template
fWDnZLRQttJ  2022-07-25T17:03:11+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster template
fWDp44XHFw7  2022-07-25T17:04:27+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster template
fWDpyj6Qbcq  2022-07-25T17:11:33+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster check ./topology.yaml --user tidb
fWDpKg3hbwg  2022-07-25T17:14:11+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster check ./topology.yaml --apply --user root
fWDpNrc8pn1  2022-07-25T17:15:06+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster check ./topology.yaml --user tidb
fWDq5SPjQsW  2022-07-25T17:19:56+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster check ./topology.yaml --user tidb
fWDqcJwFnB3  2022-07-25T17:21:38+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster check ./topology.yaml --user tidb
fWDqsr5r9zF  2022-07-25T17:25:05+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster check ./topology.yaml --user tidb
fWDr9dxMr6F  2022-07-25T17:35:52+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster check ./topology.yaml --user tidb
fWDrH4pJjpm  2022-07-25T17:43:27+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster deploy kruidb-cluster v5.3.0 ./topology.yaml --user tidb
fWDrMwhrcL3  2022-07-25T17:44:45+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster display kruidb-cluster
fWDrQCMcGdM  2022-07-25T17:45:40+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster start kruidb-cluster
fWDrSX3Djmk  2022-07-25T17:46:20+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster display kruidb-cluster
fWDs1sMGK7m  2022-07-25T17:48:33+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster edit-config kruidb-cluster
fWDs6Tk2kdB  2022-07-25T17:50:08+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster list
fWDMzrPWZ21  2022-07-25T21:56:04+08:00  /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster display kruidb-cluster
fWGm3DMvvkR  2022-07-26T18:00:00+08:00  /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster edit-config kruidb-cluster
fWGm48bVhDw  2022-07-26T18:00:09+08:00  /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster check kruidb-cluster --cluster
fWGp8JYqVFL  2022-07-26T18:31:24+08:00  /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster upgrade kruidb-cluster v5.4.2
fWGpwx1834M  2022-07-26T18:36:38+08:00  /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster      
  2. Retry the failed operation
~]$ tiup cluster replay <Audit ID>      

3.5.2. Evict Leader Takes Too Long

~]$ tiup cluster upgrade kruidb-cluster v5.4.2 --force      
Note: the --force option skips evicting the Leaders and upgrades the cluster to the new version quickly, but it ignores all errors during the upgrade and gives no useful feedback if the upgrade fails, so use it with caution.

3.5.3. Upgrading pd-ctl and Other Companion Tools

Install the ctl component of the matching version through TiUP to upgrade these tools.

~]$ tiup install ctl:v5.4.2      
~]$ tiup list --installed --verbose

Available components:
Name     Owner    Installed       Platforms    Description
----     -----    ---------       ---------    -----------
bench    pingcap  v1.7.0          linux/amd64  Benchmark database with different workloads
cluster  pingcap  v1.10.2,v1.7.0  linux/amd64  Deploy a TiDB cluster for production
ctl      pingcap  v5.4.2          linux/amd64  TiDB controller suite      

For more on managing TiUP components, see ​​https://docs.pingcap.com/zh/tidb/v5.4/tiup-component-management​​

4. Scaling TiDB/PD/TiKV/TiFlash Out and In

4.1. Scaling Out TiDB/PD/TiKV

4.1.1. Node Preparation

  1. Following section 1.3 (Host Configuration), create the tidb user, set up passwordless SSH login, and apply the system optimizations on the nodes to be added; a reminder sketch follows this item.
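The full procedure is in section 1.3 of the first article in this series; as a reminder only, a hedged sketch of that host preparation might look like this (the user name, sudoers entry, and node address are illustrative assumptions):

# On the new node, as root: create the tidb user and grant it passwordless sudo.
~]# useradd tidb && passwd tidb
~]# echo "tidb ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/tidb
# On the control machine, as the tidb user: distribute the SSH key for passwordless login.
~]$ ssh-copy-id tidb@192.168.3.227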

4.1.2. Node Configuration Files

Edit the scale-out configuration file tidb-scale-out.yaml and add the configuration for the TiDB node to be added. You can load the existing cluster configuration with tiup cluster edit-config <cluster-name> and fill in the new file against it.

  • TiDB Server configuration file
~]$ cat tidb-scale-out.yaml
tidb_servers:
  - host: 192.168.3.227      
  • PD configuration file
~]$ cat pd-scale-out.yaml
pd_servers:
  - host: 192.168.3.228      
  • TiKV configuration file
~]$ cat tikv-scale-out.yaml
tikv_servers:
  - host: 192.168.3.229      

To save time, all three kinds of nodes (TiDB, PD, and TiKV) are scaled out at once here. The combined scale-out configuration file scale-out.yaml is as follows:

pd_servers:
  - host: 192.168.3.228
tidb_servers:
  - host: 192.168.3.227
tikv_servers:
  - host: 192.168.3.229      

For a production scale-out, it is recommended to scale out each type of node separately.
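In that case the check-and-scale-out flow is simply run once per file; a sketch using the per-type files defined above:

# Scale out the TiDB node on its own, then repeat the same two commands with
# pd-scale-out.yaml and tikv-scale-out.yaml.
~]$ tiup cluster check kruidb-cluster tidb-scale-out.yaml --cluster --apply --user root -p
~]$ tiup cluster scale-out kruidb-cluster tidb-scale-out.yaml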

4.1.3. Scale-out Check

  • Run the check

Take the scale-out of TiDB (192.168.3.227) as an example.

~]$ tiup cluster check kruidb-cluster scale-out.yaml --cluster

Node           Check         Result  Message
----           -----         ------  -------
192.168.3.228  selinux       Pass    SELinux is disabled
192.168.3.228  thp           Pass    THP is disabled
192.168.3.228  command       Pass    numactl: policy: default
192.168.3.228  os-version    Pass    OS is CentOS Linux 7 (Core) 7.9.2009
192.168.3.228  cpu-cores     Pass    number of CPU cores / threads: 4
192.168.3.228  cpu-governor  Warn    Unable to determine current CPU frequency governor policy
192.168.3.228  memory        Pass    memory size is 4096MB
192.168.3.229  cpu-governor  Warn    Unable to determine current CPU frequency governor policy
192.168.3.229  memory        Pass    memory size is 4096MB
192.168.3.229  selinux       Pass    SELinux is disabled
192.168.3.229  thp           Pass    THP is disabled
192.168.3.229  command       Pass    numactl: policy: default
192.168.3.229  timezone      Pass    time zone is the same as the first PD machine: America/New_York
192.168.3.229  os-version    Pass    OS is CentOS Linux 7 (Core) 7.9.2009
192.168.3.229  cpu-cores     Pass    number of CPU cores / threads: 4
192.168.3.227  memory        Pass    memory size is 4096MB
192.168.3.227  selinux       Pass    SELinux is disabled
192.168.3.227  thp           Pass    THP is disabled
192.168.3.227  command       Pass    numactl: policy: default
192.168.3.227  timezone      Pass    time zone is the same as the first PD machine: America/New_York
192.168.3.227  os-version    Pass    OS is CentOS Linux 7 (Core) 7.9.2009
192.168.3.227  cpu-cores     Pass    number of CPU cores / threads: 4
192.168.3.227  cpu-governor  Warn    Unable to determine current CPU frequency governor policy      
  • 風險修複

應用如下指令,可修複大部分的風險。針對無法自動修複的風險,可手動修複。如下示例,需手動安裝 numactl 包。

~]$ tiup cluster check kruidb-cluster scale-out.yaml --cluster --apply --user root -p

192.168.3.228  memory        Pass    memory size is 4096MB
192.168.3.228  selinux       Pass    SELinux is disabled
192.168.3.228  thp           Pass    THP is disabled
192.168.3.228  command       Pass    numactl: policy: default
+ Try to apply changes to fix failed checks
  - Applying changes on 192.168.3.229 ... Done
  - Applying changes on 192.168.3.227 ... Done
  - Applying changes on 192.168.3.228 ... Done      

4.1.4. Performing the Scale-out

  1. Run the scale-out
~]$ tiup cluster scale-out kruidb-cluster scale-out.yaml

tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster scale-out kruidb-cluster scale-out.yaml

+ Detect CPU Arch Name
  - Detecting node 192.168.3.228 Arch info ... Done
  - Detecting node 192.168.3.229 Arch info ... Done
  - Detecting node 192.168.3.227 Arch info ... Done

+ Detect CPU OS Name
  - Detecting node 192.168.3.228 OS info ... Done
  - Detecting node 192.168.3.229 OS info ... Done
  - Detecting node 192.168.3.227 OS info ... Done
Please confirm your topology:
Cluster type:    tidb
Cluster name:    kruidb-cluster
Cluster version: v5.4.2
Role  Host           Ports        OS/Arch       Directories
----  ----           -----        -------       -----------
pd    192.168.3.228  2379/2380    linux/x86_64  /tidb-deploy/pd-2379,/tidb-data/pd-2379
tikv  192.168.3.229  20160/20180  linux/x86_64  /tidb-deploy/tikv-20160,/tidb-data/tikv-20160
tidb  192.168.3.227  4000/10080   linux/x86_64  /tidb-deploy/tidb-4000
Attention:
    1. If the topology is not what you expected, check your yaml file.
    2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: (default=N) y

......
+ Refresh components conifgs
  ......
  - Generate config prometheus -> 192.168.3.221:9090 ... Done
  - Generate config grafana -> 192.168.3.221:3000 ... Done
  - Generate config alertmanager -> 192.168.3.221:9093 ... Done
+ Reload prometheus and grafana
  - Reload prometheus -> 192.168.3.221:9090 ... Done
  - Reload grafana -> 192.168.3.221:3000 ... Done
+ [ Serial ] - UpdateTopology: cluster=kruidb-cluster
Scaled cluster `kruidb-cluster` out successfully      
  1. 檢查叢集狀态
~]$ tiup cluster display kruidb-cluster
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster
Cluster type:       tidb
Cluster name:       kruidb-cluster
Cluster version:    v5.4.2
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.3.222:2379/dashboard
Grafana URL:        http://192.168.3.221:3000
ID                   Role          Host           Ports        OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----          ----           -----        -------       ------  --------                      ----------
192.168.3.221:9093   alertmanager  192.168.3.221  9093/9094    linux/x86_64  Up      /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.3.221:3000   grafana       192.168.3.221  3000         linux/x86_64  Up      -                             /tidb-deploy/grafana-3000
192.168.3.221:2379   pd            192.168.3.221  2379/2380    linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.222:2379   pd            192.168.3.222  2379/2380    linux/x86_64  Up|UI   /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.223:2379   pd            192.168.3.223  2379/2380    linux/x86_64  Up|L    /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.228:2379   pd            192.168.3.228  2379/2380    linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.221:9090   prometheus    192.168.3.221  9090/12020   linux/x86_64  Up      /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.3.221:4000   tidb          192.168.3.221  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.222:4000   tidb          192.168.3.222  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.223:4000   tidb          192.168.3.223  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.227:4000   tidb          192.168.3.227  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.224:20160  tikv          192.168.3.224  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.225:20160  tikv          192.168.3.225  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.226:20160  tikv          192.168.3.226  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.229:20160  tikv          192.168.3.229  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
Total nodes: 15      
  3. Add the scaled-out TiDB node to HAProxy
~]# echo "server tidb-4 192.168.3.227:4000 check inter 2000 rise 2 fall 3" >> /etc/haproxy/haproxy.cfg
~]# systemctl stop haproxy
~]# systemctl start haproxy      

4.2. Scaling In TiDB/PD/TiKV

The tiup cluster scale-in command is used to scale in a TiDB cluster. TiDB handles the scale-in differently depending on the node type:

  1. For the TiKV, TiFlash, and TiDB Binlog components:
  • tiup-cluster takes TiKV, TiFlash, and TiDB Binlog offline through the API and returns immediately instead of waiting for the process to finish. Once these components complete the asynchronous offline process, their status becomes Tombstone.
  • Use tiup cluster display to watch the status of the nodes being taken offline until it becomes Tombstone.
  • Use tiup cluster prune to clean up Tombstone nodes. This command stops the services of the offline nodes, removes their data files, and updates the cluster topology to drop them.
  2. For the other components:
  • When a PD node is scaled in, the specified node is removed from the cluster through the API (this is quick), then the PD service is stopped and the node's data files are removed;
  • Other components are simply stopped and their data files removed.

4.2.1. Scaling In TiDB/PD

If the cluster sits behind HAProxy, first edit the HAProxy configuration (/etc/haproxy/haproxy.cfg), remove the TiDB node that is about to be scaled in, and restart the HAProxy service.
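A minimal sketch of that HAProxy change, assuming the backend line added in section 4.1.4:

# Drop the backend entry for the TiDB node being scaled in, then restart HAProxy.
~]# sed -i '/server tidb-4 192.168.3.227:4000/d' /etc/haproxy/haproxy.cfg
~]# systemctl restart haproxy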

  1. Look up the node IDs
~]$ tiup cluster display kruidb-cluster 
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster
Cluster type:       tidb
Cluster name:       kruidb-cluster
Cluster version:    v5.4.2
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.3.222:2379/dashboard
Grafana URL:        http://192.168.3.221:3000
ID                   Role          Host           Ports        OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----          ----           -----        -------       ------  --------                      ----------
192.168.3.221:9093   alertmanager  192.168.3.221  9093/9094    linux/x86_64  Up      /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.3.221:3000   grafana       192.168.3.221  3000         linux/x86_64  Up      -                             /tidb-deploy/grafana-3000
192.168.3.221:2379   pd            192.168.3.221  2379/2380    linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.222:2379   pd            192.168.3.222  2379/2380    linux/x86_64  Up|UI   /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.223:2379   pd            192.168.3.223  2379/2380    linux/x86_64  Up|L    /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.228:2379   pd            192.168.3.228  2379/2380    linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.221:9090   prometheus    192.168.3.221  9090/12020   linux/x86_64  Up      /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.3.221:4000   tidb          192.168.3.221  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.222:4000   tidb          192.168.3.222  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.223:4000   tidb          192.168.3.223  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.227:4000   tidb          192.168.3.227  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.224:20160  tikv          192.168.3.224  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.225:20160  tikv          192.168.3.225  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.226:20160  tikv          192.168.3.226  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160      
  2. Run the scale-in

The example below scales in the TiDB node 192.168.3.227:4000, the PD node 192.168.3.228:2379, and the TiKV node 192.168.3.229:20160 in a single command. In production it is recommended to scale in nodes one at a time.

~]$ tiup cluster scale-in kruidb-cluster --node 192.168.3.227:4000 --node 192.168.3.228:2379 --node 192.168.3.229:20160
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster scale-in kruidb-cluster --node 192.168.3.227:4000 --node 192.168.3.228:2379 --node 192.168.3.229:20160
This operation will delete the 192.168.3.227:4000,192.168.3.228:2379,192.168.3.229:20160 nodes in `kruidb-cluster` and all their data.
Do you want to continue? [y/N]:(default=N) y
The component `[tikv]` will become tombstone, maybe exists in several minutes or hours, after that you can use the prune command to clean it
Do you want to continue? [y/N]:(default=N) y
Scale-in nodes...

...
+ Reload prometheus and grafana
  - Reload prometheus -> 192.168.3.221:9090 ... Done
  - Reload grafana -> 192.168.3.221:3000 ... Done
Scaled cluster `kruidb-cluster` in successfully      
  1. 檢查叢集狀态
~]$ tiup cluster display kruidb-cluster
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster
Cluster type:       tidb
Cluster name:       kruidb-cluster
Cluster version:    v5.4.2
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.3.222:2379/dashboard
Grafana URL:        http://192.168.3.221:3000
ID                   Role          Host           Ports        OS/Arch       Status           Data Dir                      Deploy Dir
--                   ----          ----           -----        -------       ------           --------                      ----------
192.168.3.221:9093   alertmanager  192.168.3.221  9093/9094    linux/x86_64  Up               /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.3.221:3000   grafana       192.168.3.221  3000         linux/x86_64  Up               -                             /tidb-deploy/grafana-3000
192.168.3.221:2379   pd            192.168.3.221  2379/2380    linux/x86_64  Up               /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.222:2379   pd            192.168.3.222  2379/2380    linux/x86_64  Up|UI            /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.223:2379   pd            192.168.3.223  2379/2380    linux/x86_64  Up|L             /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.221:9090   prometheus    192.168.3.221  9090/12020   linux/x86_64  Up               /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.3.221:4000   tidb          192.168.3.221  4000/10080   linux/x86_64  Up               -                             /tidb-deploy/tidb-4000
192.168.3.222:4000   tidb          192.168.3.222  4000/10080   linux/x86_64  Up               -                             /tidb-deploy/tidb-4000
192.168.3.223:4000   tidb          192.168.3.223  4000/10080   linux/x86_64  Up               -                             /tidb-deploy/tidb-4000
192.168.3.224:20160  tikv          192.168.3.224  20160/20180  linux/x86_64  Up               /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.225:20160  tikv          192.168.3.225  20160/20180  linux/x86_64  Up               /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.226:20160  tikv          192.168.3.226  20160/20180  linux/x86_64  Up               /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.229:20160  tikv          192.168.3.229  20160/20180  linux/x86_64  Pending Offline  /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
Total nodes: 13      
  4. Clean up the Tombstone node

Once the TiKV node's status changes from Pending Offline to Tombstone, run tiup cluster prune <cluster-name> to clean up the offline TiKV node and refresh the cluster topology.

~]$ tiup cluster prune kruidb-cluster

tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster prune kruidb-cluster
+ [ Serial ] - SSHKeySet: privateKey=/home/tidb/.tiup/storage/cluster/clusters/kruidb-cluster/ssh/id_rsa, publicKey=/home/tidb/.tiup/storage/cluster/clusters/kruidb-cluster/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.225
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.226
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.222
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.229
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.223
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.222
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.221
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.224
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.223
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.221
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.221
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.221
+ [Parallel] - UserSSH: user=tidb, host=192.168.3.221
+ [ Serial ] - FindTomestoneNodes
Will destroy these nodes: [192.168.3.229:20160]
Do you confirm this action? [y/N]:(default=N) y 
Start destroy Tombstone nodes: [192.168.3.229:20160] ...
......
+ Reload prometheus and grafana
  - Reload prometheus -> 192.168.3.221:9090 ... Done
  - Reload grafana -> 192.168.3.221:3000 ... Done
Destroy success      
  1. 檢查叢集狀态
~]$ tiup cluster display kruidb-cluster
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster
Cluster type:       tidb
Cluster name:       kruidb-cluster
Cluster version:    v5.4.2
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.3.222:2379/dashboard
Grafana URL:        http://192.168.3.221:3000
ID                   Role          Host           Ports        OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----          ----           -----        -------       ------  --------                      ----------
192.168.3.221:9093   alertmanager  192.168.3.221  9093/9094    linux/x86_64  Up      /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.3.221:3000   grafana       192.168.3.221  3000         linux/x86_64  Up      -                             /tidb-deploy/grafana-3000
192.168.3.221:2379   pd            192.168.3.221  2379/2380    linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.222:2379   pd            192.168.3.222  2379/2380    linux/x86_64  Up|UI   /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.223:2379   pd            192.168.3.223  2379/2380    linux/x86_64  Up|L    /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.221:9090   prometheus    192.168.3.221  9090/12020   linux/x86_64  Up      /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.3.221:4000   tidb          192.168.3.221  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.222:4000   tidb          192.168.3.222  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.223:4000   tidb          192.168.3.223  4000/10080   linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.224:20160  tikv          192.168.3.224  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.225:20160  tikv          192.168.3.225  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.226:20160  tikv          192.168.3.226  20160/20180  linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
Total nodes: 12      

4.3. Scaling Out TiFlash

4.3.1. Steps to Scale Out TiFlash

To add the TiFlash component to an existing cluster, the TiDB cluster must be v5.0 or later, and the PD Placement Rules feature must be enabled (it is enabled by default in 5.0 and later).

  1. Confirm that PD Placement Rules is enabled

Enter pd-ctl interactive mode and check whether placement-rules is enabled.

~]$ tiup ctl:v5.4.2 pd -u http://192.168.3.222:2379 -i
Starting component `ctl`: /home/tidb/.tiup/components/ctl/v5.4.2/ctl pd -u http://192.168.3.222:2379 -i
» config show replication
{
  "max-replicas": 3,
  "location-labels": "",
  "strictly-match-label": "false",
  "enable-placement-rules": "true",
  "enable-placement-rules-cache": "false",
  "isolation-level": ""
}      

If it is not enabled, run config set enable-placement-rules true in pd-ctl interactive mode to enable Placement Rules. Alternatively, enable it by calling pd-ctl through the tiup ctl component.

~]$ tiup ctl:v5.4.2 pd -u http://192.168.3.222:2379 -i
>> config set enable-placement-rules true      
~]$ tiup ctl:v5.4.2 pd -u http://192.168.3.222:2379 config set enable-placement-rules true      
  2. Edit the TiFlash node configuration file tiflash-out.yaml
~]$ cat tiflash-out.yaml
tiflash_servers:
  - host: 192.168.3.228
  - host: 192.168.3.229      
  1. 擴容檢查及修複
~]$ tiup cluster check kruidb-cluster tiflash-out.yaml --cluster

~]$ tiup cluster check kruidb-cluster tiflash-out.yaml --cluster --apply --user root -p      
  4. Run the scale-out
~]$ tiup cluster scale-out kruidb-cluster tiflash-out.yaml
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster scale-out kruidb-cluster tiflash-out.yaml

+ Detect CPU Arch Name
  - Detecting node 192.168.3.228 Arch info ... Done
  - Detecting node 192.168.3.229 Arch info ... Done

+ Detect CPU OS Name
  - Detecting node 192.168.3.228 OS info ... Done
  - Detecting node 192.168.3.229 OS info ... Done
Please confirm your topology:
Cluster type:    tidb
Cluster name:    kruidb-cluster
Cluster version: v5.4.2
Role     Host           Ports                            OS/Arch       Directories
----     ----           -----                            -------       -----------
tiflash  192.168.3.228  9000/8123/3930/20170/20292/8234  linux/x86_64  /tidb-deploy/tiflash-9000,/tidb-data/tiflash-9000
tiflash  192.168.3.229  9000/8123/3930/20170/20292/8234  linux/x86_64  /tidb-deploy/tiflash-9000,/tidb-data/tiflash-9000
Attention:
    1. If the topology is not what you expected, check your yaml file.
    2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: (default=N) y 
......
+ Reload prometheus and grafana
  - Reload prometheus -> 192.168.3.221:9090 ... Done
  - Reload grafana -> 192.168.3.221:3000 ... Done
+ [ Serial ] - UpdateTopology: cluster=kruidb-cluster
Scaled cluster `kruidb-cluster` out successfully      
  5. Check the cluster
~]$ tiup cluster display kruidb-cluster

tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster
Cluster type:       tidb
Cluster name:       kruidb-cluster
Cluster version:    v5.4.2
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.3.222:2379/dashboard
Grafana URL:        http://192.168.3.221:3000
ID                   Role          Host           Ports                            OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----          ----           -----                            -------       ------  --------                      ----------
192.168.3.221:9093   alertmanager  192.168.3.221  9093/9094                        linux/x86_64  Up      /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.3.221:3000   grafana       192.168.3.221  3000                             linux/x86_64  Up      -                             /tidb-deploy/grafana-3000
192.168.3.221:2379   pd            192.168.3.221  2379/2380                        linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.222:2379   pd            192.168.3.222  2379/2380                        linux/x86_64  Up|UI   /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.223:2379   pd            192.168.3.223  2379/2380                        linux/x86_64  Up|L    /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.221:9090   prometheus    192.168.3.221  9090/12020                       linux/x86_64  Up      /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.3.221:4000   tidb          192.168.3.221  4000/10080                       linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.222:4000   tidb          192.168.3.222  4000/10080                       linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.223:4000   tidb          192.168.3.223  4000/10080                       linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.228:9000   tiflash       192.168.3.228  9000/8123/3930/20170/20292/8234  linux/x86_64  Up      /tidb-data/tiflash-9000       /tidb-deploy/tiflash-9000
192.168.3.229:9000   tiflash       192.168.3.229  9000/8123/3930/20170/20292/8234  linux/x86_64  Up      /tidb-data/tiflash-9000       /tidb-deploy/tiflash-9000
192.168.3.224:20160  tikv          192.168.3.224  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.225:20160  tikv          192.168.3.225  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.226:20160  tikv          192.168.3.226  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
Total nodes: 14      

4.3.2. Verifying the Columnar Replicas

  1. Create a test table
~]$ mysql -uroot -h 192.168.3.221 -P 4000 -proot

mysql> use test;
Database changed

mysql> create table t_test(id int, name varchar(32));
Query OK, 0 rows affected (0.55 sec)

mysql> insert into t_test values(1,'zhang3');
Query OK, 1 row affected (0.03 sec)      
  2. Add TiFlash columnar replicas for the test table
mysql> alter table test.t_test set tiflash replica 2;
Query OK, 0 rows affected (0.51 sec)      

TiFlash columnar replicas can also be created per database; the syntax is alter database <database_name> set tiflash replica <replica_count>;

  3. Check the columnar replica synchronization progress
mysql> select table_schema,table_name,replica_count,progress from information_schema.tiflash_replica;
+--------------+------------+---------------+----------+
| table_schema | table_name | replica_count | progress |
+--------------+------------+---------------+----------+
| test         | t_test     |             2 |        1 |
+--------------+------------+---------------+----------+
1 row in set (0.01 sec)      

The AVAILABLE column indicates whether the table's TiFlash replicas are available (1 = available, 0 = unavailable). Once the replicas become available, this status no longer changes; if the replica count is modified through DDL, the synchronization progress is recalculated.

The PROGRESS column shows the synchronization progress, between 0.0 and 1.0, where 1 means at least one replica has finished synchronizing.
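AVAILABLE is not selected in the query above; both columns can be checked together, for example (connection details as earlier in this section):

# Show the availability and progress of every TiFlash replica in one query.
~]$ mysql -uroot -proot -h 192.168.3.221 -P 4000 \
      -e "SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS FROM information_schema.tiflash_replica;"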

4.4. Scaling In TiFlash

4.4.1. Adjusting the Columnar Replica Count

Before scaling in TiFlash nodes, make sure the number of remaining TiFlash nodes is no smaller than the maximum replica count of any table; otherwise, reduce the replica count of the affected tables first.

~]$ mysql -uroot -h 192.168.3.221 -P 4000 -proot

mysql> SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 't_test';
+--------------+------------+----------+---------------+-----------------+-----------+----------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ID | REPLICA_COUNT | LOCATION_LABELS | AVAILABLE | PROGRESS |
+--------------+------------+----------+---------------+-----------------+-----------+----------+
| test         | t_test     |      111 |             2 |                 |         1 |        1 |
+--------------+------------+----------+---------------+-----------------+-----------+----------+
1 row in set (0.00 sec)

mysql> alter table test.t_test set tiflash replica 1;      

4.4.2. Scaling In TiFlash Nodes

4.4.2.1. Scaling In TiFlash Nodes with TiUP

  1. Look up the TiFlash node ID
~]$ tiup cluster display kruidb-cluster

tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster
Cluster type:       tidb
Cluster name:       kruidb-cluster
Cluster version:    v5.4.2
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.3.222:2379/dashboard
Grafana URL:        http://192.168.3.221:3000
ID                   Role          Host           Ports                            OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----          ----           -----                            -------       ------  --------                      ----------
192.168.3.221:9093   alertmanager  192.168.3.221  9093/9094                        linux/x86_64  Up      /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.3.221:3000   grafana       192.168.3.221  3000                             linux/x86_64  Up      -                             /tidb-deploy/grafana-3000
192.168.3.221:2379   pd            192.168.3.221  2379/2380                        linux/x86_64  Up      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.222:2379   pd            192.168.3.222  2379/2380                        linux/x86_64  Up|UI   /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.223:2379   pd            192.168.3.223  2379/2380                        linux/x86_64  Up|L    /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.221:9090   prometheus    192.168.3.221  9090/12020                       linux/x86_64  Up      /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.3.221:4000   tidb          192.168.3.221  4000/10080                       linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.222:4000   tidb          192.168.3.222  4000/10080                       linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.223:4000   tidb          192.168.3.223  4000/10080                       linux/x86_64  Up      -                             /tidb-deploy/tidb-4000
192.168.3.228:9000   tiflash       192.168.3.228  9000/8123/3930/20170/20292/8234  linux/x86_64  Up      /tidb-data/tiflash-9000       /tidb-deploy/tiflash-9000
192.168.3.229:9000   tiflash       192.168.3.229  9000/8123/3930/20170/20292/8234  linux/x86_64  Up      /tidb-data/tiflash-9000       /tidb-deploy/tiflash-9000
192.168.3.224:20160  tikv          192.168.3.224  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.225:20160  tikv          192.168.3.225  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.226:20160  tikv          192.168.3.226  20160/20180                      linux/x86_64  Up      /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
Total nodes: 14      
  2. Run the scale-in
~]$ tiup cluster scale-in kruidb-cluster --node 192.168.3.228:9000
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster scale-in kruidb-cluster --node 192.168.3.228:9000
This operation will delete the 192.168.3.228:9000 nodes in `kruidb-cluster` and all their data.
Do you want to continue? [y/N]:(default=N) y
The component `[tiflash]` will become tombstone, maybe exists in several minutes or hours, after that you can use the prune command to clean it
Do you want to continue? [y/N]:(default=N) y
Scale-in nodes...      
  3. Clean up the cluster

Once the status of the scaled-in TiFlash node becomes Tombstone, run the following command to clean up the cluster and refresh the topology.

~]$ tiup cluster prune kruidb-cluster      

4.4.2.2. Manually Forcing a TiFlash Node Offline

In special cases (for example, when a node must be forcibly taken offline), or when the TiUP operation fails, a TiFlash node can be taken offline manually as follows.

  1. Adjust the columnar replica count
~]$ mysql -uroot -h 192.168.3.221 -P 4000 -proot

mysql> alter table test.t_test set tiflash replica 0;
Query OK, 0 rows affected (0.52 sec)

mysql> SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'test' and TABLE_NAME = 't_test';
Empty set (0.00 sec)      
  2. Use pd-ctl to find the TiFlash node's Store ID
~]$ tiup ctl:v5.4.2 pd -u http://192.168.3.221:2379 store

Starting component `ctl`: /home/tidb/.tiup/components/ctl/v5.4.2/ctl pd -u http://192.168.3.221:2379 store
{
  "count": 4,
  "stores": [
  {
      "store": {
        "id": 5761,                           # This is the Store ID of the TiFlash node
        "address": "192.168.3.229:3930",
        "labels": [
          {
            "key": "engine",
            "value": "tiflash"
          }
        ],
        "version": "v5.4.2",
        "peer_address": "192.168.3.229:20170",
        "status_address": "192.168.3.229:20292",
        "git_hash": "82c1eae6ad21a2367b19029ece53ffce428df165",
        "start_timestamp": 1659013449,
        "deploy_path": "/tidb-deploy/tiflash-9000/bin/tiflash",
        "last_heartbeat": 1659015359358123962,
        "state_name": "Up"
      },
      "status": {
        "capacity": "19.56GiB",
        "available": "17.22GiB",
        "used_size": "29.79KiB",
        "leader_count": 0,
        "leader_weight": 1,
        "leader_score": 0,
        "leader_size": 0,
        "region_count": 0,
        "region_weight": 1,
        "region_score": 6556466030.143202,
        "region_size": 0,
        "slow_score": 0,
        "start_ts": "2022-07-28T21:04:09+08:00",
        "last_heartbeat_ts": "2022-07-28T21:35:59.358123962+08:00",
        "uptime": "31m50.358123962s"
      }
    },
    ......
    ]
}      

The store ID can also be obtained with the following command:

v5.4.2]$ pwd
/home/tidb/.tiup/components/ctl/v5.4.2
v5.4.2]$ ./pd-ctl -u http://192.168.3.221:2379 store      
  3. Take the TiFlash node offline with pd-ctl
~]$ tiup ctl:v5.4.2 pd -u http://192.168.3.221:2379 store delete 5761

Starting component `ctl`: /home/tidb/.tiup/components/ctl/v5.4.2/ctl pd -u http://192.168.3.221:2379 store delete 5761
Success!      
  4. Wait until the store corresponding to this TiFlash node disappears, or its state_name becomes Tombstone, and then stop the TiFlash process.
~]$ tiup cluster display kruidb-cluster
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.10.2/tiup-cluster display kruidb-cluster
Cluster type:       tidb
Cluster name:       kruidb-cluster
Cluster version:    v5.4.2
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.3.222:2379/dashboard
Grafana URL:        http://192.168.3.221:3000
ID                   Role          Host           Ports                            OS/Arch       Status     Data Dir                      Deploy Dir
--                   ----          ----           -----                            -------       ------     --------                      ----------
192.168.3.221:9093   alertmanager  192.168.3.221  9093/9094                        linux/x86_64  Up         /tidb-data/alertmanager-9093  /tidb-deploy/alertmanager-9093
192.168.3.221:3000   grafana       192.168.3.221  3000                             linux/x86_64  Up         -                             /tidb-deploy/grafana-3000
192.168.3.221:2379   pd            192.168.3.221  2379/2380                        linux/x86_64  Up         /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.222:2379   pd            192.168.3.222  2379/2380                        linux/x86_64  Up|UI      /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.223:2379   pd            192.168.3.223  2379/2380                        linux/x86_64  Up|L       /tidb-data/pd-2379            /tidb-deploy/pd-2379
192.168.3.221:9090   prometheus    192.168.3.221  9090/12020                       linux/x86_64  Up         /tidb-data/prometheus-9090    /tidb-deploy/prometheus-9090
192.168.3.221:4000   tidb          192.168.3.221  4000/10080                       linux/x86_64  Up         -                             /tidb-deploy/tidb-4000
192.168.3.222:4000   tidb          192.168.3.222  4000/10080                       linux/x86_64  Up         -                             /tidb-deploy/tidb-4000
192.168.3.223:4000   tidb          192.168.3.223  4000/10080                       linux/x86_64  Up         -                             /tidb-deploy/tidb-4000
192.168.3.229:9000   tiflash       192.168.3.229  9000/8123/3930/20170/20292/8234  linux/x86_64  Tombstone  /tidb-data/tiflash-9000       /tidb-deploy/tiflash-9000
192.168.3.224:20160  tikv          192.168.3.224  20160/20180                      linux/x86_64  Up         /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.225:20160  tikv          192.168.3.225  20160/20180                      linux/x86_64  Up         /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
192.168.3.226:20160  tikv          192.168.3.226  20160/20180                      linux/x86_64  Up         /tidb-data/tikv-20160         /tidb-deploy/tikv-20160
Total nodes: 13      
  5. Delete the TiFlash node's data files; see the sketch after this list.
  6. Manually update the cluster configuration file to remove the information of the TiFlash node that has been taken offline.
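A hedged sketch of step 5, using the data and deploy directories shown in the tiup cluster display output above; only do this after the store has disappeared or become Tombstone:

# On the TiFlash host that was taken offline, remove its data and deploy directories.
~]$ ssh tidb@192.168.3.229 "rm -rf /tidb-data/tiflash-9000 /tidb-deploy/tiflash-9000"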

The official documentation on manually scaling in a TiFlash node describes removing the TiFlash-related entries by hand with tiup cluster edit-config <cluster-name>. In practice, however, after the TiFlash entries were deleted the file could not be saved and exited with wq. In the end the TiFlash information was cleaned up as follows.

Note

~]$ tiup cluster scale-in kruidb-cluster --node 192.168.3.229:9000 --force

Manually scaling in TiFlash is meant as a fallback for when TiUP scale-in fails; if tiup cluster scale-in still has to be used to clean up the TiFlash information, the manual scale-in somewhat defeats its own purpose.

4.4.3. Clearing the Replication Rules

  1. Query all TiFlash-related data replication rules in the current PD instance
~]$ curl http://192.168.3.221:2379/pd/api/v1/config/rules/group/tiflash
null

When TiFlash-related rules still exist, the output of this query takes roughly the following form:

[
  {
    "group_id": "tiflash",
    "id": "table-45-r",
    "override": true,
    "start_key": "7480000000000000FF2D5F720000000000FA",
    "end_key": "7480000000000000FF2E00000000000000F8",
    "role": "learner",
    "count": 1,
    "label_constraints": [
      {
        "key": "engine",
        "op": "in",
        "values": [
          "tiflash"
        ]
      }
    ]
  }
]      
  2. Delete the replication rule for the table whose id is table-45-r
~]$  curl -v -X DELETE http://192.168.3.221:2379/pd/api/v1/config/rule/tiflash/table-45-r      
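To confirm, the query from step 1 can be repeated; once no TiFlash-related rules remain, the result is empty (null, as in the first query above):

~]$ curl http://192.168.3.221:2379/pd/api/v1/config/rules/group/tiflash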
