Applying Prometheus relabeling to Ceph monitoring

1. Problem description

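My work environment has three independent Ceph clusters, responsible for object storage, block storage, and file storage respectively. When I built them I did not appreciate how hard it is to rename a Ceph cluster, so all three use the default cluster name, ceph. Unfortunately, Prometheus's ceph_exporter uses the cluster name to tell clusters apart. As a result, Grafana could not distinguish the data of the individual clusters: everything was drawn into the same graphs, which was not only messy but also left some data not displaying correctly.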

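You might say: just change the Ceph cluster name. The problem is that renaming a Ceph cluster is not that simple. Ceph's data directories are named after the cluster name, so many configuration files and data directories would have to be changed for the rename to take effect, and doing that on Ceph clusters already in production is rather risky. Setting up a separate Prometheus and Grafana environment for each Ceph cluster would also solve the problem, but that approach is inelegant, and I did not want to use it unless there was no other way.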

My first idea was to modify ceph_exporter: if the cluster name cannot tell the clusters apart, adding Ceph's fsid as a label surely would.

However, it is hard to tell at a glance which Ceph cluster an fsid refers to, so this is not a good solution either.

In the end, thanks to neurodrone, I learned about Prometheus's relabel feature, which solves this problem neatly.

2. relabel configuration

Relabeling is meant for modifying the labels of exported metrics. It can filter metrics, drop unnecessary ones, and rename labels, and it also supports changing label values.
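The operations listed above can be sketched with Prometheus's metric_relabel_configs, which is applied to scraped samples. This is only an illustration and not part of the original setup: the metric name pattern ceph_osd_perf_.* and the label name pool_name are hypothetical, and the target is assumed to be the same ceph1:9128 exporter used later in this article.

```yaml
scrape_configs:
  - job_name: 'ceph1'
    static_configs:
    - targets: ['ceph1:9128']
    metric_relabel_configs:
    # Filter: drop every series whose metric name matches the pattern
    # (ceph_osd_perf_.* is a hypothetical name used for illustration).
    - source_labels: [__name__]
      regex: 'ceph_osd_perf_.*'
      action: drop
    # Rename, step 1: copy the value of "pool" into a new label "pool_name"
    # (pool_name is a hypothetical label name).
    - source_labels: [pool]
      target_label: pool_name
      action: replace
    # Rename, step 2: remove the old "pool" label.
    - regex: 'pool'
      action: labeldrop
```

The rules run in order, so the copy has to come before the labeldrop.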

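As an example, the cluster label of ceph_pool_write_total has the value ceph in all three clusters. In the Prometheus configuration, however, the clusters belong to different jobs, so we can relabel per job to assign each one a distinct cluster value and tell them apart.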

# cluster1's metric
ceph_pool_write_total{cluster="ceph",pool=".rgw.root"} 4

# cluster2's metric
ceph_pool_write_total{cluster="ceph",pool=".rgw.root"} 10

# cluster3's metric
ceph_pool_write_total{cluster="ceph",pool=".rgw.root"} 7

The concrete configuration is as follows: each job sets its own value, ceph*, and writes it to a new label, clusters.

scrape_configs:
  - job_name: 'ceph1'
    relabel_configs:
    - source_labels: ["cluster"]
      replacement: "ceph1"
      action: replace
      target_label: "clusters"
    static_configs:
    - targets: ['ceph1:9128']
      labels:
        alias: ceph1

  - job_name: 'ceph2'
    relabel_configs:
    - source_labels: ["cluster"]
      replacement: "ceph2"
      action: replace
      target_label: "clusters"
    static_configs:
    - targets: ['ceph2:9128']
      labels:
        alias: ceph2

  - job_name: 'ceph3'
    relabel_configs:
    - source_labels: ["cluster"]
      replacement: "ceph3"
      action: replace
      target_label: "clusters"
    static_configs:
    - targets: ['ceph3:9128']
      labels:
        alias: ceph3

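After the change the metrics look like this (only the labels relevant here are shown), so the data of the different Ceph clusters can be told apart.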

# cluster1's metric
ceph_pool_write_total{clusters="ceph1",pool=".rgw.root"} 4

# cluster2's metric
ceph_pool_write_total{clusters="ceph2",pool=".rgw.root"} 10

# cluster3's metric
ceph_pool_write_total{clusters="ceph3",pool=".rgw.root"} 7
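If the goal is to change the value of the existing cluster label itself instead of adding a second label, metric_relabel_configs may be an option. It is applied to the scraped samples rather than to the target. The following is a minimal sketch under that assumption, reusing the host name and port from above; it is not the configuration used in this article.

```yaml
scrape_configs:
  - job_name: 'ceph1'
    static_configs:
    - targets: ['ceph1:9128']
      labels:
        alias: ceph1
    metric_relabel_configs:
    # Applied to every scraped sample of this job: overwrite the
    # exporter-provided cluster="ceph" with cluster="ceph1".
    # No source_labels are needed because the default regex matches anything.
    - target_label: "cluster"
      replacement: "ceph1"
      action: replace
```

With this variant, dashboards that already filter on cluster would not need a new variable. Note that the rule sets cluster on every series of the job, including series that did not carry the label before.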

3. Grafana dashboard adjustments

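Changing the Prometheus configuration alone is not enough. The data still has to be shown in the UI, so the Grafana dashboard needs corresponding changes. The dashboard used in this article is Ceph - Cluster.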

First, add a clusters variable to the dashboard; this can be done in the UI.

Click the dashboard's "settings" button (the one with the gear icon).

Add the clusters variable as shown in the figure below, then save.

The new variable is now visible on the dashboard:

Next, the query of every panel has to be changed accordingly:

The final dashboard JSON file can be downloaded from the following link: ceph-cluster.json
