Using the Bulk API for batch operations
Bulk format:
{action: {metadata}}
{request body}
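action:
create: create the document only if it does not already exist
update: update an existing document
index: create a new document, or replace an existing one
delete: delete a document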
metadata:
_index, _type, _id
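The format described above can be sketched in Python. This is only a minimal illustration that builds the request body as text; the helper name `build_bulk_body` and its tuple-based input are assumptions made for this example, not part of Elasticsearch or its client libraries.

```python
import json

def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into a bulk request body.

    `source` is None for actions that carry no request body (delete).
    """
    lines = []
    for action, metadata, source in operations:
        # Action line: {"<action>": {<metadata>}} on a single line.
        lines.append(json.dumps({action: metadata}))
        if source is not None:
            # The request body follows on the next line.
            lines.append(json.dumps(source))
    # The body is newline-delimited and ends with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body([
    ("index", {"_id": 1}, {"title": "java", "price": 55}),
    ("delete", {"_index": "lib", "_type": "user", "_id": "1"}, None),
])
print(body, end="")
```

The delete action contributes one line, while index contributes two, so the body above has three lines.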
Difference between create and index
If the document already exists, create fails with an error saying that the document already exists, whereas index succeeds.
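The difference can be illustrated with a toy in-memory model. This is not Elasticsearch code: `apply_action`, the dictionary store, and the error text are all hypothetical, and the status codes are only meant to mirror the ones Elasticsearch typically returns (201 for created, 200 for updated, 409 for a conflict).

```python
def apply_action(store, action, doc_id, source):
    """Toy model of create vs. index on a dict; not Elasticsearch itself."""
    exists = doc_id in store
    if action == "create" and exists:
        # create refuses to overwrite an existing document.
        return {"status": 409, "error": "document already exists"}
    # index (and a successful create) writes the document.
    store[doc_id] = source
    return {
        "status": 200 if exists else 201,
        "result": "updated" if exists else "created",
    }

store = {}
first = apply_action(store, "create", "1", {"title": "java"})
second = apply_action(store, "create", "1", {"title": "java 2"})
third = apply_action(store, "index", "1", {"title": "java 3"})
print(first["status"], second["status"], third["status"])
```

In this sketch the second create is rejected and leaves the stored document untouched, while the index call replaces it.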
Example
{
"delete":{
"_index":"lib",
"_type":"user",
"_id":"1"
}
}
Bulk insert
POST lib2/books/_bulk
{"index":{"_id":1}}
{"title":"java","price":55}
{"index":{"_id":2}}
{"title":"html5","price":15}
{"index":{"_id":3}}
{"title":"Python","price":30}
Each action and each request body must be written on a single line; otherwise the request fails with an error.
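A rough way to catch this mistake before sending a request is to check that every line is a complete JSON object. The checker below is a sketch under that assumption; `check_bulk_body` is a hypothetical helper, and it only approximates the validation the server performs.

```python
import json

ACTIONS = ("create", "update", "index", "delete")

def check_bulk_body(body):
    """Return a list of problems found in a bulk body (empty if none)."""
    problems = []
    expect_source = False
    for number, line in enumerate(body.splitlines(), start=1):
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number}: not a complete JSON object")
            continue
        if expect_source:
            # This line is the request body of the previous action.
            expect_source = False
            continue
        action = next(iter(obj), None) if isinstance(obj, dict) else None
        if action not in ACTIONS:
            problems.append(f"line {number}: unknown action")
            continue
        # Every action except delete is followed by a request body line.
        expect_source = action != "delete"
    return problems

good = '{"index":{"_id":1}}\n{"title":"java","price":55}\n'
bad = '{"index":{"_id":1}}\n{\n"title":"java",\n"price":55\n}\n'
print(check_bulk_body(good))
print(check_bulk_body(bad))
```

The second body spreads the document over several lines, so each fragment is reported as a problem.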
Check whether the documents exist
GET lib2/books/_mget
{
"ids":[1,2,3]
}
Bulk delete and update in one request. Response:
{
"took": 3,
"errors": false,
"items": [
{
"delete": {
"_index": "lib2",
"_type": "books",
"_id": "1",
"_version": 1,
"result": "not_found",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 2,
"_primary_term": 1,
"status": 404
}
},
{
"delete": {
"_index": "lib2",
"_type": "books",
"_id": "2",
"_version": 1,
"result": "not_found",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 2,
"_primary_term": 1,
"status": 404
}
},
{
"update": {
"_index": "lib2",
"_type": "books",
"_id": "3",
"_version": 2,
"result": "noop",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"status": 200
}
}
]
}
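A response like the one above reports one result per action, and `errors` is false here even though both deletes returned 404, so it can help to inspect each item. The following sketch flattens the items into tuples; `summarize_bulk_response` is a hypothetical helper, and the response dictionary is an abridged copy of the one above.

```python
def summarize_bulk_response(response):
    """Return (action, id, result, status) for every item in a bulk response."""
    summary = []
    for item in response["items"]:
        # Each item is a one-key object whose key is the action name.
        for action, detail in item.items():
            summary.append(
                (action, detail["_id"], detail.get("result"), detail["status"])
            )
    return summary

# Abridged copy of the response shown above.
response = {
    "took": 3,
    "errors": False,
    "items": [
        {"delete": {"_id": "1", "result": "not_found", "status": 404}},
        {"delete": {"_id": "2", "result": "not_found", "status": 404}},
        {"update": {"_id": "3", "result": "noop", "status": 200}},
    ],
}
rows = summarize_bulk_response(response)
for row in rows:
    print(row)
```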
Query the result
GET lib2/books/_mget
{
"ids":[1,2,3]
}
{
"docs": [
{
"_index": "lib2",
"_type": "books",
"_id": "1",
"found": false
},
{
"_index": "lib2",
"_type": "books",
"_id": "2",
"found": false
},
{
"_index": "lib2",
"_type": "books",
"_id": "3",
"_version": 2,
"found": true,
"_source": {
"title": "Python",
"price": 58,
"type": "cheng xu she ji"
}
}
]
}
**This shows that the bulk delete and the update were both applied.**
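The same check can be done programmatically by separating found and missing documents in the _mget response. This is a small sketch; `split_mget_docs` is a hypothetical helper, and the response dictionary is an abridged copy of the one above.

```python
def split_mget_docs(response):
    """Split an _mget response into found sources and missing ids."""
    found = {}
    missing = []
    for doc in response["docs"]:
        if doc.get("found"):
            found[doc["_id"]] = doc["_source"]
        else:
            missing.append(doc["_id"])
    return found, missing

# Abridged copy of the _mget response shown above.
response = {
    "docs": [
        {"_id": "1", "found": False},
        {"_id": "2", "found": False},
        {
            "_id": "3",
            "found": True,
            "_source": {"title": "Python", "price": 58, "type": "cheng xu she ji"},
        },
    ]
}
found, missing = split_mget_docs(response)
print(missing)
print(found["3"]["price"])
```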
Bulk insert using index and create
POST /lib2/books/_bulk
{"create":{"_index":"tt","_type":"ttt","_id":100}}
{"name":"zhangsan"}
{"index":{"_index":"tt","_type":"ttt"}}
{"name":"lisi"}
Response
#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
{
"took": 92,
"errors": false,
"items": [
{
"create": {
"_index": "tt",
"_type": "ttt",
"_id": "100",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1,
"status": 201
}
},
{
"index": {
"_index": "tt",
"_type": "ttt",
"_id": "oORklGsBI3jKMDQnH__Y",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1,
"status": 201
}
}
]
}
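Because the index action above was sent without an _id, the generated id is only known from the response. One possible way to build the follow-up _mget body is to collect the ids from the bulk response, as sketched here; `mget_body_from_bulk` is a hypothetical helper, and the response dictionary is an abridged copy of the one above.

```python
import json

def mget_body_from_bulk(response):
    """Collect the _id of every bulk item into an _mget request body."""
    ids = [
        detail["_id"]
        for item in response["items"]
        for detail in item.values()
    ]
    return {"ids": ids}

# Abridged copy of the bulk response shown above.
response = {
    "errors": False,
    "items": [
        {"create": {"_id": "100", "result": "created", "status": 201}},
        {"index": {"_id": "oORklGsBI3jKMDQnH__Y", "result": "created", "status": 201}},
    ],
}
mget_body = mget_body_from_bulk(response)
print(json.dumps(mget_body))
```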
Confirm that the documents were added
GET tt/ttt/_mget
{
"ids":[100,"oORklGsBI3jKMDQnH__Y"]
}
Response
{
"docs": [
{
"_index": "tt",
"_type": "ttt",
"_id": "100",
"_version": 1,
"found": true,
"_source": {
"name": "zhangsan"
}
},
{
"_index": "tt",
"_type": "ttt",
"_id": "oORklGsBI3jKMDQnH__Y",
"_version": 1,
"found": true,
"_source": {
"name": "lisi"
}
}
]
}
How much data can one bulk request handle?
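A bulk request loads all the data it is about to process into memory, so the amount of data is limited. The optimal size is not a fixed number; it depends on your hardware, the size and complexity of your documents, and your indexing and search load.
A common recommendation is 1,000 to 5,000 documents per request, with a request size of 5 to 15 MB. By default a request cannot exceed 100 MB; this limit can be changed through the http.max_content_length setting in the Elasticsearch configuration file (elasticsearch.yml in the config directory under $ES_HOME).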
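One way to stay within such limits is to split the data into several bulk bodies before sending. The generator below is a sketch only: `chunk_bulk_entries` is a hypothetical helper, and its default limits (1,000 documents, 5 MB) are simply taken from the lower end of the recommendation above, not tuned values.

```python
import json

def chunk_bulk_entries(entries, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Group serialized bulk entries into bodies that respect both limits.

    Each entry is a string holding one action line plus its optional
    request body line, with every line terminated by a newline.
    """
    batch, batch_bytes = [], 0
    for entry in entries:
        size = len(entry.encode("utf-8"))
        # Start a new batch if adding this entry would exceed a limit.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield "".join(batch)
            batch, batch_bytes = [], 0
        batch.append(entry)
        batch_bytes += size
    if batch:
        yield "".join(batch)

entries = [
    json.dumps({"index": {"_id": i}}) + "\n" + json.dumps({"price": i}) + "\n"
    for i in range(1, 2501)
]
bodies = list(chunk_bulk_entries(entries))
# Two lines per document, so halve the newline count.
print([body.count("\n") // 2 for body in bodies])
```

An entry larger than max_bytes still ends up in a batch of its own, so the byte limit is a target rather than a hard guarantee in this sketch.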
Kibana console
POST lib2/books/_bulk
{"index":{"_id":1}}
{"title":"java","price":55}
{"index":{"_id":2}}
{"title":"html5","price":15}
{"index":{"_id":3}}
{"title":"Python","price":30}
GET lib2/books/_mget
{
"ids":[1,2,3]
}
POST /lib2/books/_bulk
{"delete":{"_index":"lib2","_type":"books","_id":1}}
{"delete":{"_index":"lib2","_type":"books","_id":2}}
{"update":{"_index":"lib2","_type":"books","_id":3}}
{"doc":{"price":58,"type":"cheng xu she ji"}}
POST /lib2/books/_bulk
{"create":{"_index":"tt","_type":"ttt","_id":100}}
{"name":"zhangsan"}
{"index":{"_index":"tt","_type":"ttt"}}
{"name":"lisi"}
GET tt/ttt/_mget
{
"ids":[100,"oORklGsBI3jKMDQnH__Y"]
}