
Elasticsearch: Batch Operations with the Bulk API

Bulk request format:

{ action: { metadata } }
{ request body }
           

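action (the operation to perform):

create: creates the document only if it does not already exist

update: updates a document

index: creates a new document, or replaces an existing one

delete: deletes a document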

metadata:

_index, _type, _id
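To make the two-line layout concrete, here is a minimal Python sketch that serializes a list of operations into a bulk payload. The helper name `build_bulk_body` and the tuple layout are assumptions made for illustration; they are not part of Elasticsearch or any client library.

```python
import json

def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into a Bulk API payload.

    `source` is None for actions that take no request body (delete).
    This is a hypothetical helper, not an Elasticsearch client function.
    """
    lines = []
    for action, metadata, source in operations:
        # Action line: {"<action>": {<metadata>}} written on a single line.
        lines.append(json.dumps({action: metadata}, separators=(",", ":")))
        if source is not None:
            # Request body line, also written on a single line.
            lines.append(json.dumps(source, separators=(",", ":")))
    # The payload ends with a newline character.
    return "\n".join(lines) + "\n"

body = build_bulk_body([
    ("index", {"_id": 1}, {"title": "java", "price": 55}),
    ("delete", {"_index": "lib2", "_type": "books", "_id": 2}, None),
])
print(body, end="")
```

The printed text can be pasted under a `POST lib2/books/_bulk` line in the Kibana console.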

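Difference between create and index

If the document already exists, a create operation fails with an error saying that the document already exists, whereas an index operation succeeds and replaces it.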
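The difference can be illustrated with a toy in-memory model. This is only a sketch: a plain dict stands in for the index, `apply_action` is a hypothetical helper, and the `version_conflict` label is shorthand chosen here (a real bulk item for a failed create carries status 409 together with an error object).

```python
def apply_action(store, action, doc_id, source):
    """Apply a create or index action to a dict that stands in for an index.

    Returns (result, status), loosely modelled on a bulk response item.
    """
    exists = doc_id in store
    if action == "create":
        if exists:
            # create refuses to overwrite an existing document.
            return ("version_conflict", 409)
        store[doc_id] = source
        return ("created", 201)
    if action == "index":
        # index creates the document or replaces the existing one.
        store[doc_id] = source
        return ("updated", 200) if exists else ("created", 201)
    raise ValueError(f"unsupported action: {action}")

store = {}
first = apply_action(store, "create", "1", {"title": "java"})
second = apply_action(store, "create", "1", {"title": "java 2"})
third = apply_action(store, "index", "1", {"title": "java 3"})
print(first, second, third)
# ('created', 201) ('version_conflict', 409) ('updated', 200)
```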

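Example (expanded over several lines for readability; in an actual request the action must be written on a single line)

{
	"delete":{
		"_index":"lib",
		"_type":"user",
		"_id":"1"
	}
}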
           
Bulk insert
POST lib2/books/_bulk
{"index":{"_id":1}}
{"title":"java","price":55}
{"index":{"_id":2}}
{"title":"html5","price":15}
{"index":{"_id":3}}
{"title":"Python","price":30}
           

Each action and each request body must be written on a single line; otherwise the request fails with an error.
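A payload can be checked against the one-line rule before it is sent. The following Python sketch is a rough pre-flight check under simplifying assumptions; `check_bulk_payload` is a hypothetical helper, and it only verifies that every line is a standalone JSON value and that action lines and body lines alternate as expected.

```python
import json

ACTIONS = {"create", "update", "index", "delete"}

def check_bulk_payload(payload):
    """Return a list of problems found in a bulk payload string (empty if none)."""
    problems = []
    if not payload.endswith("\n"):
        problems.append("payload must end with a newline")
    expect_body = False
    for number, line in enumerate(payload.splitlines(), start=1):
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            # A pretty-printed object spills over several lines and fails here.
            problems.append(f"line {number}: not a complete JSON object")
            expect_body = False
            continue
        if expect_body:
            expect_body = False  # this line is the request body of the previous action
            continue
        action = next(iter(parsed), None) if isinstance(parsed, dict) else None
        if action not in ACTIONS:
            problems.append(f"line {number}: unknown action {action!r}")
            continue
        # Every action except delete is followed by one request body line.
        expect_body = action != "delete"
    if expect_body:
        problems.append("last action is missing its request body")
    return problems

good = '{"index":{"_id":1}}\n{"title":"java","price":55}\n'
pretty = '{\n  "index": {"_id": 1}\n}\n{"title":"java","price":55}\n'
print(check_bulk_payload(good))
print(check_bulk_payload(pretty)[0])
```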

Check that the documents exist
GET lib2/books/_mget
{
  "ids":[1,2,3]
}
           
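Bulk delete and update together

The request itself is listed in the Kibana console section at the end of this article. In the response below, the deletes report not_found and the update reports noop, which is what Elasticsearch returns when the documents are already gone and the update changes nothing, as happens when the same request is run a second time.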
{
  "took": 3,
  "errors": false,
  "items": [
    {
      "delete": {
        "_index": "lib2",
        "_type": "books",
        "_id": "1",
        "_version": 1,
        "result": "not_found",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 2,
        "_primary_term": 1,
        "status": 404
      }
    },
    {
      "delete": {
        "_index": "lib2",
        "_type": "books",
        "_id": "2",
        "_version": 1,
        "result": "not_found",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 2,
        "_primary_term": 1,
        "status": 404
      }
    },
    {
      "update": {
        "_index": "lib2",
        "_type": "books",
        "_id": "3",
        "_version": 2,
        "result": "noop",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "status": 200
      }
    }
  ]
}
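When a bulk response is handled in code rather than read by eye, each item has to be inspected individually. The sketch below flattens a trimmed copy of the response above; `summarize_bulk_response` is a hypothetical helper. Note that in the response above `errors` is false even though both deletes carry status 404, so the top-level flag alone does not reveal them.

```python
def summarize_bulk_response(response):
    """Flatten a bulk response into (action, _id, result, status) tuples."""
    summary = []
    for item in response["items"]:
        # Each item is a single-key dict: {"<action>": {...details...}}.
        (action, details), = item.items()
        summary.append((action, details["_id"], details.get("result"), details["status"]))
    return summary

# Trimmed copy of the response shown above (fields not needed here are omitted).
response = {
    "took": 3,
    "errors": False,
    "items": [
        {"delete": {"_id": "1", "result": "not_found", "status": 404}},
        {"delete": {"_id": "2", "result": "not_found", "status": 404}},
        {"update": {"_id": "3", "result": "noop", "status": 200}},
    ],
}

rows = summarize_bulk_response(response)
missing = [doc_id for action, doc_id, result, status in rows if result == "not_found"]
print(rows)
print(missing)  # ['1', '2']
```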

           
Verify the result
GET lib2/books/_mget
{
  "ids":[1,2,3]
}

           
{
  "docs": [
    {
      "_index": "lib2",
      "_type": "books",
      "_id": "1",
      "found": false
    },
    {
      "_index": "lib2",
      "_type": "books",
      "_id": "2",
      "found": false
    },
    {
      "_index": "lib2",
      "_type": "books",
      "_id": "3",
      "_version": 2,
      "found": true,
      "_source": {
        "title": "Python",
        "price": 58,
        "type": "cheng xu she ji"
      }
    }
  ]
}
           

**This shows that the bulk delete and the update were both applied.**

Bulk insert using index and create
POST /lib2/books/_bulk
{"create":{"_index":"tt","_type":"ttt","_id":100}}
{"name":"zhangsan"}
{"index":{"_index":"tt","_type":"ttt"}}
{"name":"lisi"}
           
Response
#! Deprecation: the default number of shards will change from [5] to [1] in 7.0.0; if you wish to continue using the default of [5] shards, you must manage this on the create index request or with an index template
{
  "took": 92,
  "errors": false,
  "items": [
    {
      "create": {
        "_index": "tt",
        "_type": "ttt",
        "_id": "100",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "tt",
        "_type": "ttt",
        "_id": "oORklGsBI3jKMDQnH__Y",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 1,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
           

Confirm that the documents were added

GET tt/ttt/_mget
{
  "ids":[100,"oORklGsBI3jKMDQnH__Y"]
}
           
Response
{
  "docs": [
    {
      "_index": "tt",
      "_type": "ttt",
      "_id": "100",
      "_version": 1,
      "found": true,
      "_source": {
        "name": "zhangsan"
      }
    },
    {
      "_index": "tt",
      "_type": "ttt",
      "_id": "oORklGsBI3jKMDQnH__Y",
      "_version": 1,
      "found": true,
      "_source": {
        "name": "lisi"
      }
    }
  ]
}
           

How much data can a single bulk request handle?

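A bulk request loads all the data it is about to process into memory, so the amount of data per request is limited. The optimal size is not a fixed number; it depends on your hardware, the size and complexity of your documents, and your indexing and search load.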

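A common recommendation is 1,000 to 5,000 documents per request, with a payload of 5 to 15 MB. By default a request cannot exceed 100 MB; this limit can be changed through the http.max_content_length setting in the Elasticsearch configuration file (config/elasticsearch.yml under $ES_HOME).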
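One way to respect these limits on the client side is to split a long list of operations into several payloads. The following Python sketch is an illustration under assumptions: `chunk_bulk_lines` is a hypothetical helper, and the default limits of 1,000 documents and 5 MB are simply the lower ends of the ranges above, not values mandated by Elasticsearch.

```python
import json

def chunk_bulk_lines(operations, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield bulk payload strings that stay within a document and byte budget.

    `operations` is an iterable of (action_line_dict, source_dict_or_None).
    """
    batch, batch_docs, batch_bytes = [], 0, 0
    for action_line, source in operations:
        entry = json.dumps(action_line) + "\n"
        if source is not None:
            entry += json.dumps(source) + "\n"
        entry_bytes = len(entry.encode("utf-8"))
        # Start a new batch if adding this entry would exceed either limit.
        if batch and (batch_docs + 1 > max_docs or batch_bytes + entry_bytes > max_bytes):
            yield "".join(batch)
            batch, batch_docs, batch_bytes = [], 0, 0
        batch.append(entry)
        batch_docs += 1
        batch_bytes += entry_bytes
    if batch:
        yield "".join(batch)

ops = [({"index": {"_id": i}}, {"title": f"book {i}", "price": i}) for i in range(1, 8)]
payloads = list(chunk_bulk_lines(ops, max_docs=3))
print([p.count("\n") // 2 for p in payloads])  # [3, 3, 1]
```

A single entry larger than `max_bytes` is still emitted as a batch of its own, so the byte budget is a target rather than a hard guarantee.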

Kibana console

POST lib2/books/_bulk
{"index":{"_id":1}}
{"title":"java","price":55}
{"index":{"_id":2}}
{"title":"html5","price":15}
{"index":{"_id":3}}
{"title":"Python","price":30}

GET lib2/books/_mget
{
  "ids":[1,2,3]
}

POST /lib2/books/_bulk
{"delete":{"_index":"lib2","_type":"books","_id":1}}
{"delete":{"_index":"lib2","_type":"books","_id":2}}
{"update":{"_index":"lib2","_type":"books","_id":3}}
{"doc":{"price":58,"type":"cheng xu she ji"}}


POST /lib2/books/_bulk
{"create":{"_index":"tt","_type":"ttt","_id":100}}
{"name":"zhangsan"}
{"index":{"_index":"tt","_type":"ttt"}}
{"name":"lisi"}

GET tt/ttt/_mget
{
  "ids":[100,"oORklGsBI3jKMDQnH__Y"]
}
           
