
Author: 楊景江
Reviewer: 朱永生
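A rollup job is a task that runs periodically. It aggregates the data in selected indices on a schedule, using aggregations you define, and writes the aggregated results to a new index. The whole process is called Rollup.
Use cases:
Summarizing historical data:
Historical data is large and expensive to keep on disk, yet the business usually needs the raw documents only for the most recent few days. For older data, only a fixed set of metric statistics matters. To cut costs, a rollup job can summarize the historical data into a new index, after which the historical indices can be deleted (for example with ILM), which saves a substantial amount of storage.
Improving query time:
When data volume or hardware limits make real-time aggregation queries slow, a rollup job can run at night or in near real time to summarize the previous day's index, or the data from a few minutes ago, into a new index (for example, rolling millisecond-level data up into second-level or even minute-level buckets). Users then query the rolled-up index, which improves query performance.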
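To make the idea concrete, here is a minimal Python sketch of what "rolling up" means conceptually: raw events are truncated to one-minute buckets, grouped by a term field, and only a count per group is kept. It is an illustration only, not how Elasticsearch implements rollups; the function name `rollup_counts` and the sample documents are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime


def rollup_counts(docs, time_field, term_fields):
    """Count documents per (minute bucket, term values) group."""
    counts = Counter()
    for doc in docs:
        ts = datetime.fromisoformat(doc[time_field])
        # Truncate to the start of the minute: this becomes the bucket key.
        bucket = ts.replace(second=0, microsecond=0).isoformat()
        key = (bucket,) + tuple(doc[f] for f in term_fields)
        counts[key] += 1
    return counts


# Hypothetical raw documents; only the fields used for grouping are shown.
raw_docs = [
    {"timestamp_local": "2021-04-21T14:03:06.896+08:00", "cluster": "clustername-demo"},
    {"timestamp_local": "2021-04-21T14:03:41.120+08:00", "cluster": "clustername-demo"},
    {"timestamp_local": "2021-04-21T14:04:02.005+08:00", "cluster": "clustername-demo"},
]

summary = rollup_counts(raw_docs, "timestamp_local", ["cluster"])
for key, count in sorted(summary.items()):
    print(key, count)
```

Three raw documents collapse into two summary rows here. The saving grows with the number of raw documents that fall into each bucket.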
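Rollup limitations:
Rollup jobs can group fields only with the following aggregations:
- Date Histogram aggregation
- Histogram aggregation
- Terms aggregation (the most commonly used)
Numeric fields support only the following metric aggregations:
- Min aggregation
- Max aggregation
- Sum aggregation
- Average aggregation
- Value Count aggregation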
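The two lists above can be turned into a small pre-flight check. The following Python sketch inspects a job configuration before it is submitted and reports anything outside the supported groupings and metrics. The function name `find_unsupported` and the example configuration are assumptions for illustration; Elasticsearch performs its own validation when the job is created.

```python
SUPPORTED_GROUPS = {"date_histogram", "histogram", "terms"}
SUPPORTED_METRICS = {"min", "max", "sum", "avg", "value_count"}


def find_unsupported(job_config):
    """Return a list of human-readable problems found in a rollup job config."""
    problems = []
    for name in job_config.get("groups", {}):
        if name not in SUPPORTED_GROUPS:
            problems.append(f"unsupported group: {name}")
    for entry in job_config.get("metrics", []):
        for metric in entry.get("metrics", []):
            if metric not in SUPPORTED_METRICS:
                problems.append(
                    f"unsupported metric on {entry.get('field')}: {metric}"
                )
    return problems


# Hypothetical config that asks for a metric rollups cannot store.
candidate = {
    "groups": {
        "date_histogram": {"field": "timestamp_local", "calendar_interval": "1m"},
        "terms": {"fields": ["cluster"]},
    },
    "metrics": [{"field": "event.duration", "metrics": ["max", "percentiles"]}],
}

for problem in find_unsupported(candidate):
    print(problem)
```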
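Every feature should be applied to a concrete business scenario. Avoid designing a solution around a feature just for the sake of using it.
API overview
This section uses statistics over raw Elasticsearch slow query logs as the running example (sensitive information has been replaced).
Data preparation
Index mapping: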
PUT es-slowlog-2021-04-21
{
"mappings": {
"_field_names": {
"enabled": false
},
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"ignore_above": 512,
"type": "keyword"
}
}
}
],
"properties": {
"@timestamp": {
"type": "date"
},
"cluster": {
"type": "keyword",
"ignore_above": 512
},
"host": {
"properties": {
"name": {
"type": "keyword",
"ignore_above": 512
}
}
},
"elasticsearch": {
"properties": {
"index": {
"properties": {
"name": {
"type": "keyword",
"ignore_above": 512
}
}
}
}
},
"timestamp_local": {
"type": "date"
}
}
}
}
A single sample document (matching the mapping above):
POST es-slowlog-2021-04-21/_doc
{
"cluster": "clustername-demo",
"offset": 0,
"log": {
"level": "WARN"
},
"prospector": {
"type": "log"
},
"source": "/home/elasticsearch/clustername-demo_index_search_slowlog.log",
"message": "[2021-04-21T14:03:06,896][WARN ][i.s.s.query ] [host_name-demo] [basiclog-slowlog_2021-04-02][2] took[2.3s], took_millis[2307], total_hits[23129 hits], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[4], source[{\"size\":0,\"query\":{\"bool\":{\"filter\":[{\"match_all\":{\"boost\":1.0}},{\"match_phrase\":{\"logtype.keyword\":{\"query\":\"server\",\"slop\":0,\"zero_terms_query\":\"NONE\",\"boost\":1.0}}},{\"range\":{\"@timestamp\":{\"from\":\"2021-04-02T15:48:04.138Z\",\"to\":\"2021-04-02T16:03:04.138Z\",\"include_lower\":true,\"include_upper\":true,\"format\":\"strict_date_optional_time\",\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"_source\":{\"includes\":[],\"excludes\":[]},\"stored_fields\":\"*\",\"docvalue_fields\":[{\"field\":\"@timestamp\",\"format\":\"date_time\"},{\"field\":\"time\",\"format\":\"date_time\"}],\"script_fields\":{},\"track_total_hits\":2147483647,\"aggregations\":{\"2\":{\"terms\":{\"field\":\"cluster.keyword\",\"size\":20,\"min_doc_count\":1,\"shard_min_doc_count\":0,\"show_term_doc_count_error\":false,\"order\":[{\"_count\":\"desc\"},{\"_key\":\"asc\"}]}}}}], id[],",
"input": {
"type": "log"
},
"logtype": "slowlog",
"log_type": "basic-slowlog",
"timestamp_local": "2021-04-21T14:03:06.896+08:00",
"@timestamp": "2021-04-21T14:03:06.896Z",
"elasticsearch": {
"node": {
"name": "host_name-demo"
},
"slowlog": {
"took": "2.3s",
"logger": "i.s.s.query "
},
"index": {
"name": "basiclog-slowlog_2021-04-02"
},
"shard": {
"id": "2"
}
},
"host": {
"name": "host_name-demo"
},
"beat": {
"hostname": "beathostname-demo",
"name": "beathostname-demo",
"version": "6.5.4"
},
"@version": "1",
"event": {
"duration": 2307000000,
"created": "2021-04-21T06:59:11.934Z",
"kind": "event",
"category": "database",
"type": "info"
}
}
Configure the index pattern in Kibana
Note: for the latest version of the API, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/master/xpack-rollup.html
Basic APIs
Create a rollup job:
Request: PUT _rollup/job/<job_id>
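Parameter | Required | Type | Description |
---|---|---|---|
index_pattern | Yes | string | Index pattern to roll up |
rollup_index | Yes | string | Target index; some versions require the index name to start with rollup |
cron | Yes | string | Schedule on which the job runs; independent of the time interval of the rolled-up data |
page_size | Yes | integer | Number of bucket results processed in each iteration of the rollup indexer. Larger values run faster but need more memory during processing |
groups | Yes | object | Defines the grouping fields and aggregations for the rollup job |
-date_histogram | Yes | object | Defines the date histogram aggregation |
--calendar_interval | Yes | time units | Time bucket size; 1m means one bucket per minute |
--field | Yes | string | Date field to aggregate on |
--time_zone | No | string | Time zone; default: UTC |
--delay | No | time units | Rollup delay: how old data must be before it is rolled up. Some writes can arrive late, and all data must be written and searchable before the job processes it |
-terms | No | object | Fields to group by |
--fields | Yes | array | The set of fields for the terms grouping. The fields can be keyword or numeric; order does not matter |
-histogram | No | object | Aggregates one or more numeric fields into numeric histogram intervals |
--fields | Yes | array | Fields to build the histogram on; must be numeric |
--interval | Yes | integer | Interval of the histogram buckets generated during rollup |
metrics | No | array | Defines which metrics are collected for the rolled-up data |
-field | Yes | string | The field to collect metrics on |
-metrics | Yes | array | The metric aggregations to apply; for example, sum computes the sum of the field. Only min, max, sum, avg, and value_count are supported |
timeout | No | time value | Request timeout |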
PUT _rollup/job/es-slowlog-agg-id
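{
"index_pattern": "es-slowlog*", // index pattern to roll up
"rollup_index": "rollup-es-slowlog-agg", // target index; must be specified explicitly, here prefixed with rollup-
"cron": "0 * * * * ?", // schedule on which the job runs; independent of the time interval of the rolled-up data
"groups": {
"date_histogram": { // defines the date histogram aggregation
"calendar_interval": "1m", // time bucket size: one bucket per minute
"field": "timestamp_local", // date field to aggregate on
"delay": "1m", // rollup delay: how old data must be before it is rolled up; late writes must be indexed and searchable before the job processes them
"time_zone": "UTC" // time zone, e.g. GMT+8
},
"terms": {
"fields": [ // fields to group by
"cluster", // cluster name
"elasticsearch.index.name", // index name
"host.name" // host name
]
}
},
"metrics": [], // empty here, so only document counts are kept; min, max, sum, avg, and value_count can be specified
"timeout": "20s", // timeout
"page_size": 10000 // page size; larger values roll up faster but use more memory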
}
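The `//` annotations in the console request above are not valid JSON, so a program that creates the job has to send the body without them. The following Python sketch builds the same request with the standard library only. The base URL `http://localhost:9200` and the helper name `build_put_job_request` are assumptions for illustration; a secured cluster would also need authentication and TLS settings, which are not shown.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster without security


def build_put_job_request(job_id, job_config, base_url=ES_URL):
    """Build (but do not send) the PUT _rollup/job/<job_id> request."""
    body = json.dumps(job_config).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_rollup/job/{job_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Same settings as the console example above, without the comments.
job_config = {
    "index_pattern": "es-slowlog*",
    "rollup_index": "rollup-es-slowlog-agg",
    "cron": "0 * * * * ?",
    "groups": {
        "date_histogram": {
            "calendar_interval": "1m",
            "field": "timestamp_local",
            "delay": "1m",
            "time_zone": "UTC",
        },
        "terms": {"fields": ["cluster", "elasticsearch.index.name", "host.name"]},
    },
    "metrics": [],
    "timeout": "20s",
    "page_size": 10000,
}

request = build_put_job_request("es-slowlog-agg-id", job_config)
print(request.get_method(), request.full_url)
# To actually create the job: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch runnable without a cluster and makes the body easy to inspect or log first.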
List all rollup jobs:
GET _rollup/job/*
Get the details of a single rollup job:
Request: GET _rollup/job/<job_id>
GET _rollup/job/es-slowlog-agg-id
{
"jobs": [
{
"config": {
"id": "es-slowlog-agg-id",
"index_pattern": "es-slowlog*",
"rollup_index": "rollup-es-slowlog-agg",
"cron": "0 * * * * ?",
"groups": {
"date_histogram": {
"calendar_interval": "1m",
"field": "timestamp_local",
"delay": "1m",
"time_zone": "UTC"
},
"terms": {
"fields": [
"cluster",
"elasticsearch.index.name",
"host.name"
]
}
},
"metrics": [
],
"timeout": "20s",
"page_size": 10000
},
"status": {
"job_state": "stopped",
"upgraded_doc_id": true
},
"stats": {
"pages_processed": 0,
"documents_processed": 0,
"rollups_indexed": 0,
"trigger_count": 0,
"index_time_in_ms": 0,
"index_total": 0,
"index_failures": 0,
"search_time_in_ms": 0,
"search_total": 0,
"search_failures": 0,
"processing_time_in_ms": 0,
"processing_total": 0
}
}
]
}
Start a rollup job:
Request: POST _rollup/job/<job_id>/_start
POST _rollup/job/es-slowlog-agg-id/_start
// After starting the job, fetch it again and check the status and stats sections
GET _rollup/job/es-slowlog-agg-id
{
"jobs": [
{
"config": {
"id": "es-slowlog-agg-id",
"index_pattern": "es-slowlog*",
"rollup_index": "rollup-es-slowlog-agg",
"cron": "0 * * * * ?",
"groups": {
"date_histogram": {
"calendar_interval": "1m",
"field": "timestamp_local",
"delay": "1m",
"time_zone": "UTC"
},
"terms": {
"fields": [
"cluster",
"elasticsearch.index.name",
"host.name"
]
}
},
"metrics": [
],
"timeout": "20s",
"page_size": 10000
},
"status": {
"job_state": "started", // shows stopped if the job has been stopped
"current_position": { // position the rollup job has reached, including the current term values
"cluster.terms": "clustername-demo",
"elasticsearch.index.name.terms": "basiclog-slowlog_2021-04-02",
"host.name.terms": "host_name-demo",
"timestamp_local.date_histogram": 1618984980000
},
"upgraded_doc_id": true
},
"stats": { // execution statistics
"pages_processed": 2,
"documents_processed": 1,
"rollups_indexed": 1,
"trigger_count": 1,
"index_time_in_ms": 103,
"index_total": 1,
"index_failures": 0,
"search_time_in_ms": 6,
"search_total": 2,
"search_failures": 0,
"processing_time_in_ms": 0,
"processing_total": 2
}
}
]
}
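Values of status.job_state:
stopped
The job is paused and does not process data, even when its cron interval triggers.
started
The job is running but not actively rolling up data. When the cron interval triggers, the job starts processing data.
indexing
The job is actively processing data and creating new rollup documents. In this state, any subsequent cron triggers are ignored because the job is still active from the earlier trigger.
abort
A transient state that users normally do not see. It is used when the job has to shut down for some reason (the job was deleted, an unrecoverable error occurred, and so on). Shortly after entering the abort state, the job removes itself from the cluster.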
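When monitoring jobs from a script, the state descriptions above can be attached to the response of GET _rollup/job/<job_id>. The sketch below works on an already parsed response; the helper name `describe_job` and the trimmed sample response are assumptions for illustration, and only fields that appear in the responses shown in this article are read.

```python
JOB_STATE_MEANING = {
    "stopped": "paused; cron triggers are not processed",
    "started": "running and idle; waiting for the next cron trigger",
    "indexing": "actively processing data and writing rollup documents",
    "abort": "transient; the job is shutting down and will remove itself",
}


def describe_job(response, job_id):
    """Return (state, meaning, failure_count) for one job in the response."""
    for job in response.get("jobs", []):
        if job["config"]["id"] == job_id:
            state = job["status"]["job_state"]
            stats = job["stats"]
            failures = stats["index_failures"] + stats["search_failures"]
            return state, JOB_STATE_MEANING.get(state, "unknown state"), failures
    raise KeyError(f"job not found: {job_id}")


# Trimmed copy of the response shown earlier in this article.
sample_response = {
    "jobs": [
        {
            "config": {"id": "es-slowlog-agg-id"},
            "status": {"job_state": "started"},
            "stats": {"index_failures": 0, "search_failures": 0},
        }
    ]
}

state, meaning, failures = describe_job(sample_response, "es-slowlog-agg-id")
print(f"{state}: {meaning} (failures: {failures})")
```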
Stop a rollup job:
Request: POST _rollup/job/<job_id>/_stop
POST _rollup/job/es-slowlog-agg-id/_stop
Delete a rollup job:
Request: DELETE _rollup/job/<job_id>
Use the delete operation with caution.
DELETE /_rollup/job/es-slowlog-agg-id
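_rollup_search queries
Raw documents and rollup documents have different structures, so Rollup search rewrites a standard query DSL request into a form that matches the rollup documents, then takes the response and rewrites it back into the shape the client expects.
Usage:
GET <target>/_rollup_search
Rules for the <target> parameter (required, string):
- An index or wildcard expression must be specified.
- Multiple non-rollup indices may be specified.
- Only one rollup index may be specified; providing more than one raises an exception.
- Wildcard expressions are allowed, but an exception is raised if they match more than one rollup index.
e.g. es-slowlog*,rollup-es-slowlog-agg1/_rollup_search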
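The target rules above can be checked on the client side before a request is sent. The Python sketch below is a rough approximation: it assumes the caller already knows the names of the rollup indices in the cluster (for example from the capabilities APIs described later), and it uses `fnmatch`-style matching as a stand-in for Elasticsearch wildcard resolution. The function name `check_rollup_search_target` is hypothetical.

```python
from fnmatch import fnmatchcase


def check_rollup_search_target(target, rollup_indices):
    """Return the rollup indices matched by target; raise if more than one."""
    patterns = [p for p in target.split(",") if p]
    if not patterns:
        raise ValueError("an index or wildcard expression must be specified")
    matched = sorted(
        name
        for name in rollup_indices
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    )
    if len(matched) > 1:
        raise ValueError(f"target matches more than one rollup index: {matched}")
    return matched


# Assumption: these are the rollup indices that exist in the cluster.
known_rollup_indices = {"rollup-es-slowlog-agg", "rollup-es-slowlog-agg1"}

print(check_rollup_search_target(
    "es-slowlog*,rollup-es-slowlog-agg1", known_rollup_indices
))
```

A target such as `rollup-es-slowlog-agg*` would be rejected by this check, because it matches both rollup indices.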
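The request body supports a subset of the regular Search API features. It supports:
- query: specifies a DSL query, subject to some restrictions. See the rollup search limitations ( https://www.elastic.co/guide/en/elasticsearch/reference/7.x/rollup-search-limitations.html ) and the rollup aggregation limitations ( https://www.elastic.co/guide/en/elasticsearch/reference/7.x/rollup-agg-limitations.html ).
- aggregations: specifies the aggregations.
Unavailable features:
- size: rollups work on pre-aggregated data, so no search hits can be returned and size must be 0 or omitted. To see the stored rollup documents themselves, query the rollup index with _search.
- highlighter, suggestors, post_filter, profile, explain: not allowed.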
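How querying raw and rollup indices together works:
When a _rollup_search request targets both raw and rollup indices, Elasticsearch receives the two responses, rewrites the rollup response, and merges the two together. If any buckets overlap between the two responses during the merge, the buckets from the non-rollup index are used.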
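The merge rule can be pictured with a small Python sketch: buckets from both sides are combined by key, and where a key exists on both sides the bucket from the non-rollup (raw) index wins. This only illustrates the rule stated above; the bucket keys, counts, and the function name `merge_buckets` are made up for the example.

```python
def merge_buckets(raw_buckets, rollup_buckets):
    """Merge two {bucket_key: doc_count} maps; raw buckets win on overlap."""
    merged = dict(rollup_buckets)
    merged.update(raw_buckets)  # overlapping keys take the raw (non-rollup) value
    return dict(sorted(merged.items()))


# Hypothetical per-minute document counts from each side.
rollup_side = {"2021-04-21T06:02": 7, "2021-04-21T06:03": 4}
raw_side = {"2021-04-21T06:03": 5, "2021-04-21T06:04": 2}

print(merge_buckets(raw_side, rollup_side))
```

In this example the 06:03 bucket exists on both sides, and the merged result keeps the raw count of 5 rather than the rolled-up count of 4.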
Example:
Create a new, more complex job. Its details are as follows:
// A more complex job that rolls up several metrics; job details:
{
"config": {
"id": "es-slowlog-agg-id1",
"index_pattern": "es-slowlog*",
"rollup_index": "rollup-es-slowlog-agg1",
"cron": "0 * * * * ?",
"groups": {
"date_histogram": {
"calendar_interval": "1m",
"field": "timestamp_local",
"delay": "1m",
"time_zone": "UTC"
},
"histogram": {
"interval": 8,
"fields": [
"event.duration"
]
},
"terms": {
"fields": [
"cluster",
"elasticsearch.index.name",
"host.name"
]
}
},
"metrics": [
{
"field": "event.duration",
"metrics": [
"avg",
"max",
"min",
"sum",
"value_count"
]
}
],
"timeout": "20s",
"page_size": 10000
},
"status": {
"job_state": "started",
"current_position": {
"cluster.terms": "clustername-demo",
"elasticsearch.index.name.terms": "basiclog-slowlog_2021-04-02",
"event.duration.histogram": 2307000000,
"host.name.terms": "host_name-demo",
"timestamp_local.date_histogram": 1618984980000
},
"upgraded_doc_id": true
},
"stats": {
"pages_processed": 6,
"documents_processed": 1,
"rollups_indexed": 1,
"trigger_count": 5,
"index_time_in_ms": 115,
"index_total": 1,
"index_failures": 0,
"search_time_in_ms": 21,
"search_total": 6,
"search_failures": 0,
"processing_time_in_ms": 0,
"processing_total": 6
}
}
Use _search to view the raw documents stored in the rollup target index:
GET rollup-es-slowlog-agg1/_search
{
"size":10,
"query": {
"bool": {
"must": [],
"filter": [
{
"match_all": {}
}
],
"should": [],
"must_not": []
}
}
}
Response:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "rollup-es-slowlog-agg1",
"_type": "_doc",
"_id": "es-slowlog-agg-id1$5uzfGmyS2uAb3XRznkZBgA",
"_score": 1,
"_source": {
"cluster.terms.value": "bj-ali-xueyan-oa-es-cluster",
"event.duration.avg._count": 1,
"event.duration.max.value": 2377000000,
"event.duration.histogram.value": 2377000000,
"timestamp_local.date_histogram.time_zone": "UTC",
"elasticsearch.index.name.terms.value": "basiclog-slowlog_2400-2021-04-02",
"host.name.terms._count": 1,
"cluster.terms._count": 1,
"host.name.terms.value": "bj-sjhl-university-es-online-99-62",
"event.duration.avg.value": 2377000000,
"elasticsearch.index.name.terms._count": 1,
"event.duration.histogram.interval": 8,
"timestamp_local.date_histogram._count": 1,
"timestamp_local.date_histogram.timestamp": 1618995780000,
"_rollup.version": 2,
"event.duration.histogram._count": 1,
"timestamp_local.date_histogram.interval": "1m",
"event.duration.sum.value": 2377000000,
"event.duration.min.value": 2377000000,
"event.duration.value_count.value": 1,
"_rollup.id": "es-slowlog-agg-id1"
}
}
]
}
}
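The `_source` of a rollup document is a flat map whose keys appear to follow a `<field>.<aggregation>.<attribute>` pattern, plus a few `_rollup.*` metadata keys. The Python sketch below regroups such a map by field and aggregation, which can make it easier to read. The key pattern is inferred from the response above and is an assumption, as is the helper name `group_rollup_source`.

```python
def group_rollup_source(source):
    """Split a flat rollup _source into ({field: {agg: {attr: value}}}, meta)."""
    fields, meta = {}, {}
    for key, value in source.items():
        parts = key.rsplit(".", 2)
        if key.startswith("_rollup.") or len(parts) != 3:
            meta[key] = value
            continue
        # The field name itself may contain dots, so split from the right.
        field, agg, attr = parts
        fields.setdefault(field, {}).setdefault(agg, {})[attr] = value
    return fields, meta


# A few keys copied from the response above.
sample_source = {
    "event.duration.max.value": 2377000000,
    "event.duration.avg._count": 1,
    "elasticsearch.index.name.terms.value": "basiclog-slowlog_2400-2021-04-02",
    "timestamp_local.date_histogram.interval": "1m",
    "_rollup.version": 2,
    "_rollup.id": "es-slowlog-agg-id1",
}

fields, meta = group_rollup_source(sample_source)
for field, aggs in fields.items():
    print(field, aggs)
print("meta:", meta)
```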
Query data with _rollup_search (raw data and rollup data can be queried together):
GET es-slowlog*,rollup-es-slowlog-agg1/_rollup_search
{
"size": 0,
"aggregations": {
"avg_event.duration": {
"avg": {
"field": "event.duration"
}
}
}
}
// Response
{
"took": 740,
"timed_out": false,
"terminated_early": false,
"num_reduce_phases": 2,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": 0,
"hits": [
]
},
"aggregations": {
"avg_event.duration": {
"value": 2311777445.714286
}
}
}
Getting rollup capabilities
Look up the jobs configured for a given index_pattern in the rollup configuration; _all returns all of them.
Request: GET _rollup/data/<index>
// List all
GET _rollup/data/_all
// Query a specific index pattern
GET _rollup/data/es-slowlog*
{
"es-slowlog*": {
"rollup_jobs": [
{
"job_id": "es-slowlog-agg-id",
"rollup_index": "rollup-es-slowlog-agg",
"index_pattern": "es-slowlog*",
"fields": {
"cluster": [
{
"agg": "terms"
}
],
"timestamp_local": [
{
"agg": "date_histogram",
"delay": "1m",
"time_zone": "UTC",
"calendar_interval": "1m"
}
],
"elasticsearch.index.name": [
{
"agg": "terms"
}
],
"host.name": [
{
"agg": "terms"
}
]
}
},
{
"job_id": "es-slowlog-agg-id1",
"rollup_index": "rollup-es-slowlog-agg",
"index_pattern": "es-slowlog*",
"fields": {
"cluster": [
{
"agg": "terms"
}
],
"timestamp_local": [
{
"agg": "date_histogram",
"delay": "1m",
"time_zone": "UTC",
"calendar_interval": "1m"
}
],
"elasticsearch.index.name": [
{
"agg": "terms"
}
],
"host.name": [
{
"agg": "terms"
}
]
}
},
{
"job_id": "es-slowlog-agg-id1",
"rollup_index": "rollup-es-slowlog-agg1",
"index_pattern": "es-slowlog*",
"fields": {
"event.duration": [
{
"agg": "histogram",
"interval": 8
},
{
"agg": "avg"
},
{
"agg": "max"
},
{
"agg": "min"
},
{
"agg": "sum"
},
{
"agg": "value_count"
}
],
"cluster": [
{
"agg": "terms"
}
],
"timestamp_local": [
{
"agg": "date_histogram",
"delay": "1m",
"time_zone": "UTC",
"calendar_interval": "1m"
}
],
"elasticsearch.index.name": [
{
"agg": "terms"
}
],
"host.name": [
{
"agg": "terms"
}
]
}
},
{
"job_id": "es-slowlog-agg-id3",
"rollup_index": "rollupes-slowlog-agg",
"index_pattern": "es-slowlog*",
"fields": {
"cluster": [
{
"agg": "terms"
}
],
"timestamp_local": [
{
"agg": "date_histogram",
"delay": "1m",
"time_zone": "UTC",
"calendar_interval": "1m"
}
],
"elasticsearch.index.name": [
{
"agg": "terms"
}
],
"host.name": [
{
"agg": "terms"
}
]
}
}
]
}
}
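A capabilities response like the one above can be condensed into "which aggregations are available on which field, per rollup index", which is the information needed to decide whether a given _rollup_search request can be answered. The Python sketch below does this for an already parsed response; the helper name `supported_aggs` is hypothetical and the sample is a trimmed version of the response above.

```python
def supported_aggs(caps_response):
    """Map rollup_index -> field -> set of aggregation names."""
    result = {}
    for pattern_caps in caps_response.values():
        for job in pattern_caps.get("rollup_jobs", []):
            per_field = result.setdefault(job["rollup_index"], {})
            for field, aggs in job["fields"].items():
                per_field.setdefault(field, set()).update(a["agg"] for a in aggs)
    return result


# Trimmed copy of the capabilities response shown above.
sample_caps = {
    "es-slowlog*": {
        "rollup_jobs": [
            {
                "job_id": "es-slowlog-agg-id",
                "rollup_index": "rollup-es-slowlog-agg",
                "index_pattern": "es-slowlog*",
                "fields": {"cluster": [{"agg": "terms"}]},
            },
            {
                "job_id": "es-slowlog-agg-id1",
                "rollup_index": "rollup-es-slowlog-agg1",
                "index_pattern": "es-slowlog*",
                "fields": {
                    "event.duration": [
                        {"agg": "histogram", "interval": 8},
                        {"agg": "avg"},
                        {"agg": "max"},
                    ],
                    "cluster": [{"agg": "terms"}],
                },
            },
        ]
    }
}

for index, per_field in supported_aggs(sample_caps).items():
    for field, aggs in per_field.items():
        print(index, field, sorted(aggs))
```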
Look up the jobs by rollup target index; * wildcards are supported.
Request: GET <target>/_rollup/data
GET rollupes-slowlog-*/_rollup/data
GET rollupes-slowlog-agg/_rollup/data
{
"rollupes-slowlog-agg": {
"rollup_jobs": [
{
"job_id": "es-slowlog-agg-id3",
"rollup_index": "rollupes-slowlog-agg",
"index_pattern": "es-slowlog*",
"fields": {
"cluster": [
{
"agg": "terms"
}
],
"timestamp_local": [
{
"agg": "date_histogram",
"delay": "1m",
"time_zone": "UTC",
"calendar_interval": "1m"
}
],
"elasticsearch.index.name": [
{
"agg": "terms"
}
],
"host.name": [
{
"agg": "terms"
}
]
}
}
]
}
}
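Using Kibana
With a basic understanding of the API, building the slow query statistics for an Elasticsearch cluster through Kibana is fairly simple.
Some Kibana features have bugs when the UI language is Chinese (for example, errors can occur when selecting metrics for a rollup job), so setting the Kibana language to English is recommended.
Fill in Logistics
Select Date histogram (required)
Select Terms; here, choose the cluster name, index name, and node name (optional)
Select Histogram as needed (optional). The Elasticsearch slow query rollup in this example only needs document counts, so nothing is selected here; go straight to the next step
Fill in Metrics as needed (optional). The Elasticsearch slow query rollup in this example only needs document counts, so nothing is selected here; go straight to the next step
Finish and save
Check the status
When configuring the index pattern, make sure to choose Rollup index pattern; chart configuration is the same as for a regular index pattern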
About the author:
楊景江 focuses on middleware technologies such as ES, Redis, and RocketMQ.
Blog:
https://blog.csdn.net/xiaoyanghapi/article/month/2016/08