Elasticsearch Getting-Started Tutorial
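This article follows the official documentation [1], uses a docker container to quickly set up an Elasticsearch environment, and draws on Ruan Yifeng's blog post [2], "Full-Text Search Engine Elasticsearch: A Getting-Started Tutorial", to summarize the Elasticsearch quick start.
It is strongly recommended to read Ruan Yifeng's post, "Full-Text Search Engine Elasticsearch: A Getting-Started Tutorial", before this article.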
Installation
Official installation guide:
https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html
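Basic concepts
1. Node and Cluster
Elastic is essentially a distributed database that lets multiple servers work together, and each server can run multiple Elastic instances. A single Elastic instance is called a node, and a group of nodes forms a cluster.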
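2. Index
Elastic indexes every field and, after processing, writes the result to an inverted index (Inverted Index). Searches look up that index directly. For this reason, the top-level unit of data management in Elastic is called an Index. It is the equivalent of a single database. The name of every Index (that is, every database) must be lowercase.
The following command lists all the Indexes on the current node.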
$ curl -X GET 'http://localhost:9200/_cat/indices?v'
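To make the inverted index idea concrete, here is a toy sketch in Python. It only illustrates the data structure (term to document IDs). Real Elasticsearch analysis involves tokenizers and filters that this sketch does not attempt to model, and the function name build_inverted_index and the sample documents are made up for illustration.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis: lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "GET /images/bg.jpg",
    2: "GET /favicon.ico",
    3: "POST /images/upload",
}

index = build_inverted_index(docs)
print(index["get"])   # [1, 2]
print(index["post"])  # [3]
```

A lookup for a term is then a single dictionary access instead of a scan over every document, which is why searches go against the index directly.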
3. Adding a single document
POST logs-my_app-default/_doc
{
"@timestamp": "2099-05-06T16:21:15.000Z",
"event": {
"original": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736"
}
}
Result:
{
"_index": ".ds-logs-my_app-default-2099-05-06-000001",
"_type": "_doc",
"_id": "gl5MJXMBMk1dGnErnBW8",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
4. Adding multiple documents
PUT logs-my_app-default/_bulk
{ "create": { } }
{ "@timestamp": "2099-05-07T16:24:32.000Z", "event": { "original": "192.0.2.242 - - [07/May/2020:16:24:32 -0500] \"GET /images/hm_nbg.jpg HTTP/1.0\" 304 0" } }
{ "create": { } }
{ "@timestamp": "2099-05-08T16:25:42.000Z", "event": { "original": "192.0.2.255 - - [08/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" } }
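The _bulk body above is newline-delimited JSON: one action line followed by one document line, repeated, with a newline after the last line. As a rough sketch of how such a body could be assembled in Python (the helper name build_bulk_body is hypothetical and no request is actually sent):

```python
import json

def build_bulk_body(docs, action="create"):
    """Return an NDJSON string with one action line before each document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({action: {}}))  # action line, e.g. {"create": {}}
        lines.append(json.dumps(doc))           # document source line
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"

docs = [
    {"@timestamp": "2099-05-07T16:24:32.000Z",
     "event": {"original": "192.0.2.242 - - [07/May/2020:16:24:32 -0500] \"GET /images/hm_nbg.jpg HTTP/1.0\" 304 0"}},
    {"@timestamp": "2099-05-08T16:25:42.000Z",
     "event": {"original": "192.0.2.255 - - [08/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638"}},
]

body = build_bulk_body(docs)
print(body)
```

The resulting string would be sent as the request body with a Content-Type of application/x-ndjson.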
5. Searching data
Query all documents in logs-my_app-default and sort them by @timestamp in descending order:
GET logs-my_app-default/_search
{
"query": {
"match_all": { }
},
"sort": [
{
"@timestamp": "desc"
}
]
}
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": ".ds-logs-my_app-default-2099-05-06-000001",
"_type": "_doc",
"_id": "PdjWongB9KPnaVm2IyaL",
"_score": null,
"_source": {
"@timestamp": "2099-05-08T16:25:42.000Z",
"event": {
"original": "192.0.2.255 - - [08/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638"
}
},
"sort": [
4081940742000
]
},
...
]
}
}
6. Retrieving specific fields and omitting others:
GET logs-my_app-default/_search
{
"query": {
"match_all": { }
},
"fields": [
"@timestamp"
],
"_source": false,
"sort": [
{
"@timestamp": "desc"
}
]
}
{
...
"hits": {
...
"hits": [
{
"_index": ".ds-logs-my_app-default-2099-05-06-000001",
"_type": "_doc",
"_id": "PdjWongB9KPnaVm2IyaL",
"_score": null,
"fields": {
"@timestamp": [
"2099-05-08T16:25:42.000Z"
]
},
"sort": [
4081940742000
]
},
...
]
}
}
"fields" selects which fields to return; "_source": false keeps the _source field out of the response.
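As a minimal sketch of how a client might read such a response, the Python snippet below parses one hit copied from the response above. It relies on two points: values under fields are returned as arrays, and the sort value for this date field is the timestamp in milliseconds since the Unix epoch. The helper names first_field_value and epoch_millis_to_utc are illustrative only and are not part of any Elasticsearch client.

```python
from datetime import datetime, timedelta, timezone

# One hit copied from the response above.
hit = {
    "_index": ".ds-logs-my_app-default-2099-05-06-000001",
    "_id": "PdjWongB9KPnaVm2IyaL",
    "fields": {"@timestamp": ["2099-05-08T16:25:42.000Z"]},
    "sort": [4081940742000],
}

def first_field_value(hit, name):
    """Return the first value of a field, or None if the field is absent."""
    values = hit.get("fields", {}).get(name, [])
    return values[0] if values else None

def epoch_millis_to_utc(millis):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=millis)

timestamp = first_field_value(hit, "@timestamp")
sort_time = epoch_millis_to_utc(hit["sort"][0])
print(timestamp)              # 2099-05-08T16:25:42.000Z
print(sort_time.isoformat())  # 2099-05-08T16:25:42+00:00
```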
7. Range search with range
GET logs-my_app-default/_search
{
"query": {
"range": {
"@timestamp": {
"gte": "2099-05-05",
"lt": "2099-05-08"
}
}
},
"fields": [
"@timestamp"
],
"_source": false,
"sort": [
{
"@timestamp": "desc"
}
]
}
Query the previous day's data:
GET logs-my_app-default/_search
{
"query": {
"range": {
"@timestamp": {
"gte": "now-1d/d",
"lt": "now/d"
}
}
},
"fields": [
"@timestamp"
],
"_source": false,
"sort": [
{
"@timestamp": "desc"
}
]
}
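The date math in this query can be read as follows: "now-1d/d" subtracts one day from the current time and rounds down to the start of that day, and "now/d" rounds the current time down to the start of today. Combined with gte and lt, the range covers the whole of yesterday. The Python sketch below mimics that calculation, assuming the default UTC time zone; the function name previous_day_range is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def previous_day_range(now):
    """Return (gte, lt) bounds equivalent to now-1d/d and now/d in UTC."""
    start_of_today = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    # gte: start of yesterday (inclusive); lt: start of today (exclusive).
    return start_of_today - timedelta(days=1), start_of_today

# Pretend "now" is the timestamp of the newest sample document.
example_now = datetime(2099, 5, 8, 16, 25, 42, tzinfo=timezone.utc)
gte, lt = previous_day_range(example_now)
print(gte.isoformat())  # 2099-05-07T00:00:00+00:00
print(lt.isoformat())   # 2099-05-08T00:00:00+00:00
```

With this "now", only the document stamped 2099-05-07T16:24:32.000Z would fall inside the range.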
8. Creating an Index
PUT my_index
{
"mappings":
{
"properties":
{
"address":
{
"type": "ip"
},
"port":
{
"type": "long"
}
}
}
}
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "my_index"
}
9. Loading some documents into it:
POST my_index/_bulk
{"index":{"_id":"1"}}
{"address":"1.2.3.4","port":"80"}
{"index":{"_id":"2"}}
{"address":"1.2.3.4","port":"8080"}
{"index":{"_id":"3"}}
{"address":"2.4.8.16","port":"80"}
Response:
{
"took" : 8,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "my_index",
"_type" : "_doc",
"_id" : "3",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 201
}
}
]
}
10. Concatenating two fields with a static string
GET my_index/_search
{
"runtime_mappings": {
"socket": {
"type": "keyword",
"script": {
"source": "emit(doc['address'].value + ':' + doc['port'].value)"
}
}
},
"fields": [
"socket"
],
"query": {
"match": {
"socket": "1.2.3.4:8080"
}
}
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"address" : "1.2.3.4",
"port" : "8080"
},
"fields" : {
"socket" : [
"1.2.3.4:8080"
]
}
}
]
}
}
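In the response above, the took field is the time the operation took (in milliseconds), the timed_out field indicates whether the request timed out, and the hits field holds the matching records. Its subfields mean the following:
- total: the number of matching records, 1 in this example.
- max_score: the highest relevance score, 1.0 in this example.
- hits: an array of the returned records.
In each returned record, the _source field holds the original document. A found field appears only in responses from the get-document API, not in search responses like the one above.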
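We defined the field socket in the runtime_mappings section. We used a short painless script that defines how the value of socket is computed for each document (using + to concatenate the value of the address field, the static string ":" and the value of the port field). We then used the socket field in the query. socket is a temporary runtime field that exists only for this query and is computed when the query runs. When defining a painless script for use with runtime fields, you must include emit to return the computed value.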
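To illustrate what the runtime field does, the following Python sketch imitates it outside Elasticsearch: it computes socket for each of the three documents loaded in step 9 and keeps the ones whose value equals the search term. This is only a simulation under simplifying assumptions (port is held as an integer because the mapping declares it as long, and the match is treated as exact string equality); emit_socket and search_by_socket are hypothetical names.

```python
# The three documents from step 9, keyed by _id.
docs = {
    "1": {"address": "1.2.3.4", "port": 80},
    "2": {"address": "1.2.3.4", "port": 8080},
    "3": {"address": "2.4.8.16", "port": 80},
}

def emit_socket(doc):
    """Mimic the script: address + ':' + port, computed per document."""
    return f"{doc['address']}:{doc['port']}"

def search_by_socket(docs, wanted):
    """Return the IDs of documents whose computed socket equals wanted."""
    return [doc_id for doc_id, doc in docs.items() if emit_socket(doc) == wanted]

print(search_by_socket(docs, "1.2.3.4:8080"))  # ['2']
```

Nothing is stored for socket here either: the value is derived from address and port each time the search runs, which mirrors how the runtime field is computed at query time.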
socket: the field added at runtime.
source, id
From the official documentation:
"The script itself, which you specify as source for an inline script or id for a stored script. Use the stored script APIs to create and manage stored scripts."
"source": "emit(doc['address'].value + ':' + doc['port'].value)" is an inline script.
11. If socket turns out to be a field we want to use in multiple queries without defining it for each one, we can simply add it to the mapping by calling:
PUT my_index/_mapping
{
"runtime": {
"socket": {
"type": "keyword",
"script": {
"source": "emit(doc['address'].value + ':' + doc['port'].value)"
}
}
}
}
Result:
{
"acknowledged" : true
}
The socket field now exists in the Index mapping, so later queries no longer need to define socket at query time. For example:
GET my_index/_search
{
"fields": [
"socket"
],
"query": {
"match": {
"socket": "1.2.3.4:8080"
}
}
}
Result (identical to the result in step 10):
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"address" : "1.2.3.4",
"port" : "8080"
},
"fields" : {
"socket" : [
"1.2.3.4:8080"
]
}
}
]
}
}
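The "fields": ["socket"] clause is needed only when you want to display the value of the socket field. The socket field is now available to any query, but it is not stored in the index and does not increase the index size. socket is computed only when a query needs it, and only for the documents that need it.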
12. Difference between runtime and runtime_mappings:
Fields defined under runtime are stored in the Index mapping, whereas fields defined under runtime_mappings exist only for the search request that defines them.
Runtime fields in a mapping:
https://www.elastic.co/guide/en/elasticsearch/reference/7.11/runtime-mapping-fields.html#runtime-mapping-fields
Runtime fields in a search request:
https://www.elastic.co/guide/en/elasticsearch/reference/7.11/runtime-search-request.html#runtime-search-request
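The difference can also be seen in where the same field definition is placed. The Python sketch below builds the request bodies used in steps 10 and 11 from a single definition: under "runtime" for PUT my_index/_mapping, or under "runtime_mappings" for GET my_index/_search. The function names are hypothetical and nothing is sent to a server.

```python
import json

def socket_runtime_field():
    """The socket definition shared by both placements."""
    return {
        "socket": {
            "type": "keyword",
            "script": {
                "source": "emit(doc['address'].value + ':' + doc['port'].value)"
            },
        }
    }

def mapping_update_body():
    """Body for PUT my_index/_mapping: the field is saved in the Index mapping."""
    return {"runtime": socket_runtime_field()}

def search_body(define_inline):
    """Body for GET my_index/_search, optionally defining socket per request."""
    body = {"fields": ["socket"], "query": {"match": {"socket": "1.2.3.4:8080"}}}
    if define_inline:
        # Only needed while socket is not yet part of the mapping.
        body["runtime_mappings"] = socket_runtime_field()
    return body

print(json.dumps(mapping_update_body(), indent=2))
print(json.dumps(search_body(define_inline=False), indent=2))
```

Once the mapping update has been applied, the shorter search body without runtime_mappings is enough, as in step 11.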
13. Overriding field values at query time
PUT my_raw_index
{
"mappings": {
"properties": {
"raw_message": {
"type": "keyword"
},
"address": {
"type": "ip"
}
}
}
}
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "my_raw_index"
}
References
[1] Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html
[2]