
Elasticsearch index management: zero-downtime reindexing with scroll, bulk, and index aliases

Contents

  • 1. Approach
  • 2. Experiment
  • 3. Summary

1. Approach

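The mapping of an existing field cannot be changed. To change a field, create a new index (index_new) with the corrected mapping, read the existing data out in batches, and write it into index_new with the bulk API. The scroll API is the recommended way to do the batched reads, and the reindex can be run concurrently across several threads: each scroll covers the documents for one date range and is handed to one thread. Once all data has been imported into index_new, the client switches over to it.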
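As a rough illustration of the per-thread slicing described above, the following Python sketch splits a date range into consecutive slices and builds one scroll request body per slice. The field name created_at, the slice width, and the batch size are assumptions made for the example; the index used in the experiment below has no such date field.

```python
from datetime import date, timedelta


def date_slices(start, end, days_per_slice):
    """Split the half-open range [start, end) into consecutive date slices."""
    if days_per_slice < 1:
        raise ValueError("days_per_slice must be at least 1")
    slices = []
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=days_per_slice), end)
        slices.append((cursor, upper))
        cursor = upper
    return slices


def scroll_body_for_slice(field, lower, upper, size=1000):
    """Build a scroll search body for one slice (lower inclusive, upper exclusive)."""
    return {
        "query": {
            "range": {field: {"gte": lower.isoformat(), "lt": upper.isoformat()}}
        },
        "sort": ["_doc"],  # _doc order is the cheapest sort for scrolling
        "size": size,
    }


# "created_at" is a hypothetical date field, used only for illustration.
bodies = [
    scroll_body_for_slice("created_at", lower, upper)
    for lower, upper in date_slices(date(2019, 1, 1), date(2019, 1, 8), 3)
]
```

Each body could then be sent to /my_index/_search?scroll=1m by its own worker thread, so that no two threads read the same documents.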

2. Experiment

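(1) At first the index relies on dynamic mapping. Some of the inserted values happen to be formatted like dates (for example 2017-01-01), so the title field is automatically mapped as [date], although it should really be a [string] type.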

Index the data:

PUT /my_index/my_type/1
{
  "title":"2019-01-01"
}

PUT /my_index/my_type/2
{
  "title":"2019-01-02"
}

PUT /my_index/my_type/3
{
  "title":"2019-01-03"
}

Check the mapping:

GET /my_index/_mapping/my_type

{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "title": {
            "type": "date"
          }
        }
      }
    }
  }
}

(2) Later, adding a title value that is a plain string to the index fails.

Insert the data:

PUT /my_index/my_type/4
{
  "title":"hello elasticsearch"
}

Response:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [title]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse [title]",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Invalid format: \"hello elasticsearch\""
    }
  },
  "status": 400
}

(3) Changing the type of title at this point is not possible:

PUT /my_index/_mapping/my_type
{
  "properties": {
    "title":{
      "type": "text"
    }
  }
}

Response:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "mapper [title] of different type, current_type [date], merged_type [text]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "mapper [title] of different type, current_type [date], merged_type [text]"
  },
  "status": 400
}

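(4) The only option now is to reindex: create a new index, read the data out of the old index, and import it into the new one.

(5) Suppose the old index is named old_index and the new one new_index, and the Java application is already operating on old_index. Would we have to stop the Java application, change the index it uses to new_index, and then restart it? That would mean downtime for the application and reduced availability.

(6) So give the Java application an alias that points to the old index. The application works against the goods_index alias, which at this point actually points to the old my_index.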

PUT /my_index/_alias/goods_index

Response:

{
  "acknowledged": true
}

(7) Create a new index in which title is a string type (text):

PUT /my_index_new
{
  "mappings": {
    "my_type":{
      "properties": {
        "title":{
          "type": "text"
        }
      }
    }
  }
}

(8) Use the scroll API to read the data out in batches.

For this example, fetching one document per batch is enough:

GET /my_index/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "sort": ["_doc"],
  "size": 1
}

Response:

{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAO5Fm1vRkdXOXJzU2VxaVRaaXBsLXZvZlEAAAAAAAADuhZtb0ZHVzlyc1NlcWlUWmlwbC12b2ZRAAAAAAAAA7cWbW9GR1c5cnNTZXFpVFppcGwtdm9mUQAAAAAAAAO7Fm1vRkdXOXJzU2VxaVRaaXBsLXZvZlEAAAAAAAADuBZtb0ZHVzlyc1NlcWlUWmlwbC12b2ZR",
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_score": null,
        "_source": {
          "title": "2019-01-02"
        },
        "sort": [
        ]
      }
    ]
  }
}

(9) Use the bulk API to write the batch returned by scroll into the new index:

POST /_bulk
{ "index":  { "_index": "my_index_new", "_type": "my_type", "_id": "2" }}
{ "title":    "2017-01-02" }

Response:

{
  "took": 2161,
  "errors": false,
  "items": [
    {
      "index": {
        "_index": "my_index_new",
        "_type": "my_type",
        "_id": "2",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "created": true,
        "status": 201
      }
    }
  ]
}
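Rather than typing the bulk body by hand, it can be generated from the hits of a scroll response. The Python sketch below is one possible way to do that; the function name hits_to_bulk_body is made up for this example, and it assumes an Elasticsearch version that still uses mapping types (_type), as in this experiment.

```python
import json


def hits_to_bulk_body(hits, target_index):
    """Turn scroll hits into a newline-delimited body for POST /_bulk.

    Each hit becomes two lines: an "index" action that keeps the original
    _type and _id, followed by the unchanged _source document.
    """
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    if not lines:
        return ""
    # The bulk API expects every line, including the last, to end with "\n".
    return "\n".join(lines) + "\n"


# One hit shaped like the scroll response in step (8).
sample_hits = [
    {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_source": {"title": "2019-01-02"},
    }
]
bulk_body = hits_to_bulk_body(sample_hits, "my_index_new")
```

Copying _source through unchanged keeps the values exactly as they were stored in the old index, which avoids transcription mistakes when documents are re-entered manually.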

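(10) Repeat steps 8 and 9: keep fetching batch after batch, passing the returned _scroll_id to get each subsequent batch, and write every batch into the new index with the bulk API.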
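The loop over steps 8 and 9 might look roughly like the Python sketch below. The three callables are hypothetical placeholders: in a real client, start_scroll would wrap GET /my_index/_search?scroll=1m, next_page would wrap GET /_search/scroll with the previous _scroll_id, and write_bulk would send the batch to POST /_bulk.

```python
def reindex_with_scroll(start_scroll, next_page, write_bulk):
    """Copy every scroll page into the new index and return the document count.

    start_scroll()      -> first scroll response (a dict)
    next_page(scroll_id) -> next scroll response for that scroll id
    write_bulk(hits)    -> writes one batch of hits to the new index
    """
    copied = 0
    response = start_scroll()
    # An empty hits list means the scroll has been exhausted.
    while response["hits"]["hits"]:
        hits = response["hits"]["hits"]
        write_bulk(hits)
        copied += len(hits)
        response = next_page(response["_scroll_id"])
    return copied


# Dry run against canned pages instead of a live cluster.
fake_pages = iter([
    {"_scroll_id": "s1", "hits": {"hits": [{"_id": "2"}]}},
    {"_scroll_id": "s2", "hits": {"hits": [{"_id": "1"}, {"_id": "3"}]}},
    {"_scroll_id": "s3", "hits": {"hits": []}},
])
written = []
total = reindex_with_scroll(
    lambda: next(fake_pages),
    lambda scroll_id: next(fake_pages),
    written.extend,
)
```

Passing the I/O in as functions keeps the loop itself easy to test without a cluster; the dry run above copies three documents in two batches.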

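(11) Switch the goods_index alias over to my_index_new. The Java application then uses the data in the new index directly through the alias and does not need to be stopped: zero downtime, high availability.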

POST /_aliases
{
    "actions": [
        { "remove": { "index": "my_index", "alias": "goods_index" }},
        { "add":    { "index": "my_index_new", "alias": "goods_index" }}
    ]
}

Response:

{
  "acknowledged": true
}
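The remove and add actions are sent in a single _aliases request, which Elasticsearch applies atomically, so the alias never points at nothing in between. A small Python helper for building that body could look like this; the function name alias_swap_actions is an assumption for the example.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build the POST /_aliases body that moves an alias in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


swap_body = alias_swap_actions("goods_index", "my_index", "my_index_new")
print(json.dumps(swap_body, indent=2))
```

The same helper can be reused for the next reindex by passing my_index_new as the old index and a newer index as the target.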

(12) Query directly through the goods_index alias to check that it works:

GET /goods_index/my_type/_search
{
  "query": {
    "match_all": {}
  }
}

Response:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index_new",
        "_type": "my_type",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "2017-01-02"
        }
      }
    ]
  }
}

Check the field type in the new index's mapping:

GET /goods_index/_mapping/my_type

{
  "my_index_new": {
    "mappings": {
      "my_type": {
        "properties": {
          "title": {
            "type": "text"
          }
        }
      }
    }
  }
}

Insert string data:

PUT /goods_index/my_type/6
{
  "title":"hello elasticsearch"
}

Response:

{
  "_index": "my_index_new",
  "_type": "my_type",
  "_id": "6",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

Query all documents:

GET /goods_index/my_type/_search
{
  "query": {
    "match_all": {}
  }
}

Response:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index_new",
        "_type": "my_type",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "2017-01-02"
        }
      },
      {
        "_index": "my_index_new",
        "_type": "my_type",
        "_id": "6",
        "_score": 1,
        "_source": {
          "title": "hello elasticsearch"
        }
      }
    ]
  }
}

3. Summary

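String values can now be added to the new index, and the original data has been imported (in this example, only one of the original documents was imported).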
