ES 7.x Data Migration in Action (Snapshot, S3)

Author: A maw miscellaneous

Preface

The previous article surveyed migration schemes for ES; different scenarios call for different schemes. Today we walk through a snapshot-based migration. Snapshots are well suited to migrating large volumes of data across clusters. Creating a snapshot of an index is also incremental: Elasticsearch analyzes the index files already stored in the repository and copies only the files that have been created or updated since the last snapshot, so multiple snapshots can be stored compactly in the same repository. Snapshot creation is non-blocking: an index can still be written to and searched while it is being snapshotted. Nevertheless, a snapshot captures a point-in-time view of the index, so records added after the snapshot process starts do not appear in it. The snapshot process starts immediately for primary shards that have been started and are not relocating at that moment.
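The incremental behaviour described above can be pictured with a toy model. This is only an illustration of the idea, not how Elasticsearch implements it; the file names and checksums are made up.

```python
# Toy model of incremental snapshots: NOT Elasticsearch's real implementation.
def files_to_copy(repo_files, index_files):
    """Return the index files that the repository does not hold yet.

    Both arguments map a file name to a checksum (hypothetical values).
    """
    return {
        name: checksum
        for name, checksum in index_files.items()
        if repo_files.get(name) != checksum
    }


repo = {"_0.cfs": "a1", "_1.cfs": "b2"}                   # stored by snapshot 1
index = {"_0.cfs": "a1", "_1.cfs": "b2", "_2.cfs": "c3"}  # index at snapshot 2

print(files_to_copy(repo, index))  # {'_2.cfs': 'c3'}
```

Only the new file is copied for the second snapshot, which is why several snapshots of the same index take little extra space.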

Operation

Let's get straight to it. We run the requests from the Kibana console; you can of course use curl directly instead.
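Each console line in this article stands for a plain HTTP request against the cluster. As a rough sketch of that mapping, using only the Python standard library: the base URL `http://localhost:9200` and the helper names `console_request` and `send` are assumptions for illustration, and authentication is left out.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: adjust to your cluster


def console_request(method, path, body=None, base_url=ES_URL):
    """Build the HTTP request that a console line such as
    `PUT _snapshot/stock_backup` stands for."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        f"{base_url}/{path.lstrip('/')}", data=data, method=method
    )
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def send(request):
    """Send the request and decode the JSON reply (needs a reachable cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


req = console_request("GET", "_snapshot/_all")
print(req.get_method(), req.full_url)  # GET http://localhost:9200/_snapshot/_all
```

Calling `send(req)` performs the request; it is kept separate so the request can be inspected without a running cluster.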

Install the S3 plugin

Besides S3, other repository types such as local storage (fs) can also be used. First, enter the Docker container:

docker exec -it c76785ab5a8b bash

# Install the plugin
./bin/elasticsearch-plugin install repository-s3

exit           

Restart the container. (Installing the plugin this way is not recommended: if the container dies and is recreated, the plugin is gone. Mounting the plugin into the container is the recommended approach.)

docker restart c76785ab5a8b           

Check to see if the installation was successful

GET _cat/plugins

c76785ab5a8b analysis-ik 7.8.1
c76785ab5a8b repository-s3 7.8.1           

The repository-s3 plugin is listed, so the installation was successful. In general, the plugin version corresponds to the ES version.
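The same check can be scripted. The sketch below parses the plain-text `_cat/plugins` output shown above and confirms that a plugin is listed with the expected version; the function names are hypothetical helpers, not part of any Elasticsearch client.

```python
def parse_cat_plugins(text):
    """Parse `GET _cat/plugins` output into (node, plugin, version) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def plugin_matches(text, plugin, es_version):
    """True if `plugin` is listed with exactly `es_version` on some node."""
    return any(
        name == plugin and version == es_version
        for _node, name, version in parse_cat_plugins(text)
    )


sample = """c76785ab5a8b analysis-ik 7.8.1
c76785ab5a8b repository-s3 7.8.1"""

print(plugin_matches(sample, "repository-s3", "7.8.1"))  # True
```

In a multi-node cluster the plugin has to be present on every node, so a fuller check would compare the node names in this output against the list of nodes.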

Create a repository

Before creating the repository, first store MinIO's access key (AK) and secret key (SK) in the Elasticsearch keystore as secure settings. Enter the Docker container the same way as before:

./bin/elasticsearch-keystore add s3.client.default.access_key
./bin/elasticsearch-keystore add s3.client.default.secret_key           

After adding both keys, exit and restart the container (as above, not the recommended approach)

Then start creating the repository

PUT _snapshot/stock_backup
{
  "type": "s3",
  "settings": {
    "bucket": "stock",
    "protocol": "http",
    "disable_chunked_encoding": "true",
    "endpoint": "172.0.0.1:9000"
  }
}           

Verify that the creation was successful

GET _snapshot/_all?pretty

{
  "stock_backup" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "stock",
      "disable_chunked_encoding" : "true",
      "endpoint" : "172.0.0.1:9000",
      "protocol" : "http"
    }
  }
}           

The repository now exists, so we can back up the data right away

Backup

Full backup

PUT _snapshot/stock_backup/snapshot_all           

Partial backup: only the specified indices are backed up

Parameters can also be added to the request:

indices: The indices to back up, separated by commas.

max_wait: Maximum wait time. (This and wait_interval appear to be options of the Curator snapshot action rather than parameters of the snapshot REST API.)

wait_interval: Wait interval.

wait_for_completion: Query-string parameter that specifies whether the create-snapshot request waits for the snapshot to complete before returning (true) or returns as soon as the snapshot has been initialized (false, the default).

ignore_unavailable: When set to true, indices that do not exist are ignored during snapshot creation. By default, if ignore_unavailable is not set, the snapshot request fails when a listed index does not exist.

include_global_state: Setting this to false prevents the global state of the cluster from being stored as part of the snapshot.

partial: By default, if one or more indices in a snapshot do not have all primary shards available, the whole snapshot fails. Setting partial to true changes this behavior.

PUT _snapshot/stock_backup/default_all
{
  "indices": "dec_default_news,dec_default_rate,dec_default_ha",
  "ignore_unavailable": true,
  "include_global_state": false
}           
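To make the parameter handling concrete, here is a small sketch that assembles the path and body of the request above. `create_snapshot_request` is a hypothetical helper, not a client library function; it only shows that `indices` is a comma-separated string in the body, while `wait_for_completion` travels in the query string.

```python
from urllib.parse import urlencode


def create_snapshot_request(repository, snapshot, indices=None,
                            ignore_unavailable=False,
                            include_global_state=True,
                            partial=False,
                            wait_for_completion=False):
    """Return (path, body) for PUT _snapshot/<repository>/<snapshot>."""
    path = f"_snapshot/{repository}/{snapshot}"
    if wait_for_completion:
        # wait_for_completion is a query-string parameter, not a body field.
        path += "?" + urlencode({"wait_for_completion": "true"})
    body = {
        "ignore_unavailable": ignore_unavailable,
        "include_global_state": include_global_state,
        "partial": partial,
    }
    if indices:
        body["indices"] = ",".join(indices)
    return path, body


path, body = create_snapshot_request(
    "stock_backup", "default_all",
    indices=["dec_default_news", "dec_default_rate", "dec_default_ha"],
    ignore_unavailable=True,
    include_global_state=False,
    wait_for_completion=True,
)
print("PUT", path)  # PUT _snapshot/stock_backup/default_all?wait_for_completion=true
```

Leaving `indices` out of the body snapshots all indices, which corresponds to the full backup shown earlier.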

View the backup status

GET _snapshot/stock_backup/default_all  # View a single snapshot

{
  "snapshots" : [
    {
      "snapshot" : "default_all",
      "uuid" : "4ZgKyuBWTE2vtowAczIDpQ",
      "version_id" : 7080199,
      "version" : "7.8.1",
      "indices" : [
        "dec_default_news",
        "dec_default_rate",
        "dec_default_ha"
      ],
      "include_global_state" : false,
      "state" : "SUCCESS",
      "start_time" : "2022-04-02T03:16:09.842Z",
      "start_time_in_millis" : 1648869369842,
      "end_time" : "2022-04-02T03:16:09.842Z",
      "end_time_in_millis" : 1648869369842,
      "duration_in_millis" : 0,
      "failures" : [ ],
      "shards" : {
        "total" : 3,
        "failed" : 0,
        "successful" : 3
      }
    }
  ]
}

GET _snapshot/stock_backup/_all?pretty  # View all snapshots

{
  "snapshots" : [
    {
      "snapshot" : "default_all",
      "uuid" : "4ZgKyuBWTE2vtowAczIDpQ",
      "version_id" : 7080199,
      "version" : "7.8.1",
      "indices" : [
        "dec_default_news",
        "dec_default_rate",
        "dec_default_ha"
      ],
      "include_global_state" : false,
      "state" : "SUCCESS",
      "start_time" : "2022-04-02T03:16:09.842Z",
      "start_time_in_millis" : 1648869369842,
      "end_time" : "2022-04-02T03:16:09.842Z",
      "end_time_in_millis" : 1648869369842,
      "duration_in_millis" : 0,
      "failures" : [ ],
      "shards" : {
        "total" : 3,
        "failed" : 0,
        "successful" : 3
      }
    }
  ]
}           
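For large indices the snapshot will not finish instantly, and the `state` field reads `IN_PROGRESS` until it ends in `SUCCESS`, `PARTIAL` or `FAILED`. A minimal polling sketch follows; `wait_for_snapshot` and its parameters are hypothetical, and the `fetch` callable stands in for whatever performs the single-snapshot GET request and parses the JSON.

```python
import time


def wait_for_snapshot(fetch, interval=5.0, max_wait=600.0, sleep=time.sleep):
    """Poll until the snapshot leaves the IN_PROGRESS state.

    `fetch` is any callable returning the parsed JSON of
    GET _snapshot/<repository>/<snapshot>; it is injected so the loop
    can be exercised without a cluster.
    """
    waited = 0.0
    while True:
        info = fetch()["snapshots"][0]
        if info["state"] != "IN_PROGRESS":
            return info["state"], info["shards"]["failed"]
        if waited >= max_wait:
            raise TimeoutError(f"snapshot still running after {max_wait} seconds")
        sleep(interval)
        waited += interval


# Canned responses standing in for a real cluster.
responses = iter([
    {"snapshots": [{"state": "IN_PROGRESS", "shards": {"failed": 0}}]},
    {"snapshots": [{"state": "SUCCESS", "shards": {"failed": 0}}]},
])
print(wait_for_snapshot(lambda: next(responses), sleep=lambda seconds: None))
# ('SUCCESS', 0)
```

Anything other than `SUCCESS` with zero failed shards is worth investigating before restoring on the target cluster.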

At this point, the backup has completed successfully

Restore

Since we are migrating data across clusters, repeat the same step on the other cluster and create the same repository there

PUT _snapshot/stock_backup
{
  "type": "s3",
  "settings": {
    "bucket": "stock",
    "protocol": "http",
    "disable_chunked_encoding": "true",
    "endpoint": "172.0.0.1:9000"
  }
}           
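Because the same repository definition is registered on two clusters, it can help to generate it from one place. The sketch below builds the body used above; `s3_repository_body` is a hypothetical helper. The optional `readonly` repository setting is not used in this article, but it is worth considering on the target cluster so that only one cluster writes to the bucket.

```python
def s3_repository_body(bucket, endpoint, protocol="http", readonly=False):
    """Build the body for PUT _snapshot/<repository> with an S3-type repository."""
    settings = {
        "bucket": bucket,
        "protocol": protocol,
        "disable_chunked_encoding": "true",
        "endpoint": endpoint,
    }
    if readonly:
        # Assumption for illustration: the target cluster only reads snapshots.
        settings["readonly"] = True
    return {"type": "s3", "settings": settings}


source_body = s3_repository_body("stock", "172.0.0.1:9000")
target_body = s3_repository_body("stock", "172.0.0.1:9000", readonly=True)

print(source_body["settings"]["bucket"])       # stock
print(target_body["settings"].get("readonly"))  # True
```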

Then list the snapshots that the new cluster can see in the repository

GET _snapshot/stock_backup/_all?pretty
{
  "snapshots" : [
    {
      "snapshot" : "default_all",
      "uuid" : "4ZgKyuBWTE2vtowAczIDpQ",
      "version_id" : 7080199,
      "version" : "7.8.1",
      "indices" : [
        "dec_default_news",
        "dec_default_rate",
        "dec_default_ha"
      ],
      "include_global_state" : false,
      "state" : "SUCCESS",
      "start_time" : "2022-04-02T03:16:09.842Z",
      "start_time_in_millis" : 1648869369842,
      "end_time" : "2022-04-02T03:16:09.842Z",
      "end_time_in_millis" : 1648869369842,
      "duration_in_millis" : 0,
      "failures" : [ ],
      "shards" : {
        "total" : 3,
        "failed" : 0,
        "successful" : 3
      }
    }
  ]
}           

If the snapshot is listed, all that remains is to restore it

POST _snapshot/stock_backup/default_all/_restore

{
  "acknowledged" : true
}           
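The restore call above restores every index in the snapshot under its original name, which only works if no open index with the same name already exists on the target cluster. The restore API also accepts a body to select indices or rename them on the way in. A sketch of building such a body; `restore_request` is a hypothetical helper, and the `restored_` prefix is just an example.

```python
def restore_request(repository, snapshot, indices=None,
                    rename_pattern=None, rename_replacement=None,
                    include_global_state=False):
    """Return (path, body) for POST _snapshot/<repository>/<snapshot>/_restore."""
    path = f"_snapshot/{repository}/{snapshot}/_restore"
    body = {"include_global_state": include_global_state}
    if indices:
        body["indices"] = ",".join(indices)
    if rename_pattern is not None and rename_replacement is not None:
        # The pattern is a regular expression; $1 refers to its first group.
        body["rename_pattern"] = rename_pattern
        body["rename_replacement"] = rename_replacement
    return path, body


path, body = restore_request(
    "stock_backup", "default_all",
    indices=["dec_default_rate"],
    rename_pattern="(.+)",
    rename_replacement="restored_$1",
)
print("POST", path)  # POST _snapshot/stock_backup/default_all/_restore
print(body)
```

With this body, only dec_default_rate would be restored, under the name restored_dec_default_rate.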

Wait for the restore to finish, then check the indices

GET _cat/indices

yellow open dec_default_news   HWykC-xpQVK0ZqK-3NjXVA 1 1     308    0   208kb   208kb
yellow open dec_default_rate   F3JFzHF-QK2AH_9IUmnacA 1 1  409471    0 221.5mb 221.5mb
yellow open dec_default_ha     c78OXNB1T3KafgVHj7TwiA 1 1     164    0 250.2kb 250.2kb           
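A quick way to validate the migration is to compare document counts between the source and target clusters. The sketch below parses the default `_cat/indices` columns (health, status, index, uuid, pri, rep, docs.count, docs.deleted, store.size, pri.store.size); the helper names are hypothetical. The yellow health shown above presumably means the replica shards are unassigned, which is expected when each index has one replica and the cluster has a single node.

```python
from collections import namedtuple

CatIndex = namedtuple(
    "CatIndex",
    "health status index uuid pri rep docs_count docs_deleted store_size pri_store_size",
)


def parse_cat_indices(text):
    """Parse default `GET _cat/indices` output; rows with other shapes are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 10:
            rows.append(CatIndex(*parts))
    return rows


def doc_counts(text):
    """Map each index name to its document count."""
    return {row.index: int(row.docs_count) for row in parse_cat_indices(text)}


target = """yellow open dec_default_news   HWykC-xpQVK0ZqK-3NjXVA 1 1     308    0   208kb   208kb
yellow open dec_default_rate   F3JFzHF-QK2AH_9IUmnacA 1 1  409471    0 221.5mb 221.5mb
yellow open dec_default_ha     c78OXNB1T3KafgVHj7TwiA 1 1     164    0 250.2kb 250.2kb"""

print(doc_counts(target))
# {'dec_default_news': 308, 'dec_default_rate': 409471, 'dec_default_ha': 164}
```

Running the same function on the source cluster's output and comparing the two dictionaries gives a simple completeness check.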

Home and dry: the migration is complete.
