Elasticsearch Index Modules

Index Settings

https://www.elastic.co/guide/en/elasticsearch/reference/7.15/index-modules.html#index-modules-settings

static

Static settings can only be set when the index is created or while the index is closed.

index.number_of_shards

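Defaults to 1, with a maximum of 1024 per index. This setting can only be set at index creation time; it cannot be changed even after the index is closed.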

dynamic

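Dynamic settings can be changed on a live index through the update-index-settings REST API.

See the official documentation linked above for the full list of settings.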

Index Shard Allocation

https://www.elastic.co/guide/en/elasticsearch/reference/7.15/index-modules-allocation.html

Index-level shard allocation filtering

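In a node's configuration you can tag the node with custom attributes, for example node.attr.size: medium

Index settings can then use conditions on those attributes to control where shards are allocated. For example, to allocate the test index to nodes whose size is big or medium:

PUT test/_settings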
{
  "index.routing.allocation.include.size": "big,medium"
}

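To allocate the index only to nodes whose size is big and whose rack is rack1, use require. With include, a node matching either condition would be eligible:

PUT test/_settings
{
  "index.routing.allocation.require.size": "big",
  "index.routing.allocation.require.rack": "rack1"
}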

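Matching rules

index.routing.allocation.include.{attribute}: the node must match at least one of the comma-separated values.

index.routing.allocation.require.{attribute}: the node must match all of the comma-separated values.

index.routing.allocation.exclude.{attribute}: the node must match none of the comma-separated values.

When several filters are set, every require condition must hold, no exclude condition may hold, and at least one include condition must hold.

Built-in attributes: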

`_name`	Match nodes by node name
`_host_ip`	Match nodes by host IP address (IP associated with hostname)
`_publish_ip`	Match nodes by publish IP address
`_ip`	Match either `_host_ip` or `_publish_ip`
`_host`	Match nodes by hostname
`_id`	Match nodes by node id

Attribute values may contain wildcards:

PUT test/_settings
{
  "index.routing.allocation.include._ip": "192.168.2.*"
}
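The filter settings above all follow one naming pattern, so the request can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name allocation_filter_request and the base URL http://localhost:9200 are assumptions for illustration, and the request is only built, never sent.

```python
import json
import urllib.request


def allocation_filter_request(base_url, index, rule, attribute, values):
    """Build (but do not send) a PUT <index>/_settings request for one filter.

    `rule` is one of include / require / exclude, as described in the text.
    This helper is hypothetical; it only assembles the documented setting name.
    """
    if rule not in ("include", "require", "exclude"):
        raise ValueError(f"unknown rule: {rule!r}")
    setting = f"index.routing.allocation.{rule}.{attribute}"
    body = {setting: ",".join(values)}  # multiple values are comma-separated
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed local cluster address, for illustration only.
req = allocation_filter_request(
    "http://localhost:9200", "test", "include", "size", ["big", "medium"]
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would send it to a running cluster; authentication and TLS are left out of this sketch.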

Delaying allocation when a node leaves

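When a node leaves the cluster for any reason, the master node reacts by:

Promoting a replica shard to primary to replace each primary that was on the node (if a replica exists).

Allocating replica shards to replace the missing replicas (assuming there are enough nodes).

Rebalancing shards evenly across the remaining nodes.

If the node is only gone briefly (for example because of a network problem), its return triggers yet another round of rebalancing. When this happens frequently it puts a heavy load on the cluster, which is why allocation can be delayed after a node leaves.

Without delayed allocation, the scenario looks like this:

Node 5 loses network connectivity.

The master promotes a replica shard to primary for each primary that was on node 5.

The master allocates new replicas to other nodes in the cluster.

Each new replica makes a full copy of the primary shard across the network.

More shards are moved to different nodes to rebalance the cluster.

Node 5 returns after a few minutes.

The master rebalances the cluster by allocating shards to node 5.

The length of the delay can be changed dynamically with

index.unassigned.node_left.delayed_timeout

which defaults to 1m. It can be set on a specific index or on _all:

PUT _all/_settings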
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}

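With delayed allocation enabled, the scenario becomes:

The master logs a message stating that allocation of unassigned shards has been delayed, and for how long.

The cluster remains yellow because there are unassigned replica shards.

Node 5 returns after a few minutes, before the timeout expires.

The missing replicas are re-allocated to node 5 (and sync-flushed shards recover almost immediately).

NOTE: This setting does not affect the promotion of replicas to primaries, nor the allocation of replicas that were not assigned previously. In particular, delayed allocation does not come into effect after a full cluster restart. Also, in case of a master failover, the elapsed delay time is forgotten (that is, reset to the full initial delay).

Operations tip for removing a node: if a node is never going to return and you want Elasticsearch to allocate the missing shards immediately, just update the timeout to zero: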

PUT _all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "0"
  }
}
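The effect of the timeout can be pictured with a small model. This is a simplified sketch, not Elasticsearch code: parse_timeout handles only a subset of time units, and replica_action is a hypothetical function that just compares elapsed time with the configured delay.

```python
import re

# Seconds per unit for the subset of time units this sketch understands.
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_timeout(value):
    """Convert a value such as "5m" or "0" to seconds."""
    if value == "0":
        return 0.0
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def replica_action(seconds_since_node_left, delayed_timeout="1m"):
    """Say what this simplified model does with a replica lost with the node."""
    if seconds_since_node_left >= parse_timeout(delayed_timeout):
        return "allocate to another node"
    return "wait for the node to return"


# Node has been gone for two minutes.
print(replica_action(120, "5m"))  # still inside the 5m window
print(replica_action(120, "1m"))  # default window already expired
print(replica_action(0, "0"))     # timeout of zero: no waiting at all
```

With a timeout of "0" the model never waits, which matches the tip above for nodes that are known to be gone for good.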

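Index recovery prioritization

Unallocated shards are recovered in order of priority, determined by:

the optional index.priority setting (higher before lower)

the index creation date (newer before older)

the index name (higher before lower)

This means that, by default, newer indices are recovered before older indices.

Use

index.priority

to set the priority; the larger the number, the higher the priority.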

PUT index_4
{
  "settings": {
    "index.priority": 5
  }
}
or

PUT index_4/_settings
{
  "index.priority": 1
}
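The ordering rule can be expressed as a three-part sort key. The sketch below is an illustration under assumptions: IndexInfo and recovery_order are hypothetical names, creation dates are made-up epoch values, and a default priority of 1 is assumed for indices that do not set index.priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexInfo:
    name: str
    creation_date: int  # made-up epoch value; larger means newer
    priority: int = 1   # assumed default when index.priority is not set


def recovery_order(indices):
    """Order by priority, then creation date, then name, all descending."""
    ranked = sorted(
        indices,
        key=lambda i: (i.priority, i.creation_date, i.name),
        reverse=True,
    )
    return [i.name for i in ranked]


indices = [
    IndexInfo("index_1", creation_date=1000),
    IndexInfo("index_2", creation_date=2000),
    IndexInfo("index_3", creation_date=3000, priority=10),
    IndexInfo("index_4", creation_date=4000, priority=5),
]
print(recovery_order(indices))
# → ['index_3', 'index_4', 'index_2', 'index_1']
```

index_3 and index_4 come first because of their explicit priorities; index_2 precedes index_1 only because it is newer.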

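Total shards per node

The cluster tries to spread shards as evenly as possible across nodes. The following settings put hard limits on that:

index.routing.allocation.total_shards_per_node: per index; the maximum number of shards (primaries and replicas) of this index allocated to a single node. Unbounded by default.

cluster.routing.allocation.total_shards_per_node: cluster-wide; the maximum number of shards (primaries and replicas) allocated to a single node, regardless of index. Unbounded by default.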

Index Blocks

https://www.elastic.co/guide/en/elasticsearch/reference/7.15/index-modules-blocks.html

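Index blocks restrict write, read, or metadata operations on an index.

Supported settings:

index.blocks.read_only: set to true to make the index and its metadata read-only.

index.blocks.read_only_allow_delete: like index.blocks.read_only, but still allows the index itself to be deleted to free up resources.

index.blocks.read: set to true to disable read operations against the index.

index.blocks.write: set to true to disable data write operations. Unlike read_only, it does not affect metadata. For example, an index with a write block can be closed, but an index with a read_only block cannot.

index.blocks.metadata: set to true to disable reads and writes of the index metadata.

For the index Mapper module, see the mapping chapter.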

Translog

https://www.elastic.co/guide/en/elasticsearch/reference/7.15/index-modules-translog.html

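As with flushing to disk in other distributed systems, a Lucene commit is expensive, so it cannot be performed after every write. Write operations are therefore recorded in the translog first (initially in the operating system's page cache), and after a crash the operations are recovered from the translog.

By default, index.translog.durability is set to request: the translog is fsynced on every write request, so Elasticsearch reports success of an index, delete, update, or bulk request to the client only after the translog has been successfully fsynced and committed on the primary and on every replica.

If index.translog.durability is set to async, the translog on the primary and on each replica is fsynced only periodically, every index.translog.sync_interval. Operations that had not yet been fsynced when a crash happens may be lost; in exchange, this mode gives somewhat better performance.

Supported settings:

index.translog.sync_interval: how often the translog is fsynced to disk and committed. Defaults to 5s. Values below 100ms are not allowed.

index.translog.durability: request (default) or async.

index.translog.flush_threshold_size: the maximum size of the translog. Once it is reached, a flush happens and a new Lucene commit point is generated. Defaults to 512mb. The larger the translog, the longer recovery takes.

index.translog.retention.size: the total size of translog files kept for each shard. Defaults to 512mb.

index.translog.retention.age: the maximum duration for which translog files are kept for each shard. Defaults to 12h.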
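The difference between request and async durability comes down to when fsync is called. The toy log below is an analogy only, not Elasticsearch's implementation: TinyTranslog is a hypothetical class, and in async mode it piggybacks the periodic sync on the next append instead of running a background task. A fake clock is injected so the demo is deterministic.

```python
import os
import tempfile
import time


class TinyTranslog:
    """Toy append-only log illustrating per-request vs periodic fsync."""

    def __init__(self, path, durability="request", sync_interval=5.0, clock=None):
        if durability not in ("request", "async"):
            raise ValueError(f"unknown durability: {durability!r}")
        self._file = open(path, "ab")
        self._durability = durability
        self._sync_interval = sync_interval
        self._clock = clock or time.monotonic
        self._last_sync = self._clock()
        self.fsync_count = 0

    def _sync(self):
        self._file.flush()              # push Python's buffer to the OS
        os.fsync(self._file.fileno())   # push the OS page cache to disk
        self._last_sync = self._clock()
        self.fsync_count += 1

    def append(self, operation):
        self._file.write(operation.encode("utf-8") + b"\n")
        if self._durability == "request":
            self._sync()                # durable before acknowledging
        elif self._clock() - self._last_sync >= self._sync_interval:
            self._sync()                # durable only once per interval

    def close(self):
        self._file.close()


def demo():
    now = [0.0]                         # fake clock, advanced by hand
    with tempfile.TemporaryDirectory() as tmp:
        per_request = TinyTranslog(os.path.join(tmp, "a.log"), "request")
        periodic = TinyTranslog(
            os.path.join(tmp, "b.log"), "async", 5.0, clock=lambda: now[0]
        )
        for t in (0.0, 1.0, 2.0, 6.0):
            now[0] = t
            per_request.append(f"index doc at t={t}")
            periodic.append(f"index doc at t={t}")
        counts = (per_request.fsync_count, periodic.fsync_count)
        per_request.close()
        periodic.close()
    return counts


print(demo())  # → (4, 1)
```

In the async log the first three operations sit unsynced until t=6, which is the window in which a crash would lose them.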

History retention

https://www.elastic.co/guide/en/elasticsearch/reference/7.15/index-modules-history-retention.html

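At the Lucene level there are only two write operations: indexing a new document and deleting an existing one. Replica recovery and cross-cluster replication need to replay both kinds of operation. An indexed document already carries the data needed to replay it, but the record of a delete has to be kept for some time, so the retention period is configurable.

index.soft_deletes.retention_lease.period: defaults to 12h.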

Index Sorting

https://www.elastic.co/guide/en/elasticsearch/reference/7.15/index-modules-index-sorting.html

Documents can be sorted within each shard of an index (note: per shard, not across the whole index). By default no sort is applied.

PUT my-index-000001
{
  "settings": {
    "index": {
      "sort.field": "date", 
      "sort.order": "desc"  
    }
  },
  "mappings": {
    "properties": {
      "date": {
        "type": "date"
      }
    }
  }
}

Sorting by multiple fields:

PUT my-index-000001
{
  "settings": {
    "index": {
      "sort.field": [ "username", "date" ], 
      "sort.order": [ "asc", "desc" ]       
    }
  },
  "mappings": {
    "properties": {
      "username": {
        "type": "keyword",
        "doc_values": true
      },
      "date": {
        "type": "date"
      }
    }
  }
}

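index.sort.field: the field or fields to sort by. Only boolean, numeric, date, and keyword fields with doc_values are allowed.

index.sort.order: asc or desc.

index.sort.mode: sorting supports multi-valued fields, so this setting chooses which value of a document is used for sorting: min (the lowest value) or max (the highest value).

index.sort.missing: how to treat documents that lack the field; they can be sorted _last or _first.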
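How order, mode, and missing interact is easiest to see on a few documents. The sketch below only models the behaviour described above; sort_docs is a hypothetical helper and the documents are made up, with every field value held as a list so that multi-valued fields are covered.

```python
def sort_docs(docs, field, order="asc", mode="min", missing="_last"):
    """Sort documents the way the index.sort.* settings are described."""
    if order not in ("asc", "desc"):
        raise ValueError(f"unknown order: {order!r}")
    if mode not in ("min", "max"):
        raise ValueError(f"unknown mode: {mode!r}")
    if missing not in ("_last", "_first"):
        raise ValueError(f"unknown missing: {missing!r}")
    pick = min if mode == "min" else max
    # Documents with no value for the field are set aside, not compared.
    present = [d for d in docs if d.get(field)]
    absent = [d for d in docs if not d.get(field)]
    # mode reduces a multi-valued field to the single value used for sorting.
    present.sort(key=lambda d: pick(d[field]), reverse=(order == "desc"))
    return absent + present if missing == "_first" else present + absent


docs = [
    {"id": "a", "price": [30, 10]},
    {"id": "b", "price": [20]},
    {"id": "c"},  # no price at all
]

print([d["id"] for d in sort_docs(docs, "price", "asc", "min")])
# → ['a', 'b', 'c']
print([d["id"] for d in sort_docs(docs, "price", "asc", "max")])
# → ['b', 'a', 'c']
print([d["id"] for d in sort_docs(docs, "price", "desc", "max", "_first")])
# → ['c', 'a', 'b']
```

Document a moves depending on mode because its two values straddle the single value of b.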

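By default, a search request must visit every document that matches the query. When the index.sort.* configuration matches the sort of the search, the search can terminate early and visit far fewer documents.

For example, with the following index the search requests further down can terminate early and still return correct results.

Index sorted by timestamp, descending:

PUT events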
{
  "settings": {
    "index": {
      "sort.field": "timestamp",
      "sort.order": "desc" 
    }
  },
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date"
      }
    }
  }
}

Search for the first 10 events, sorted by timestamp descending:

GET /events/_search
{
  "size": 10,
  "sort": [
    { "timestamp": "desc" }
  ]
}

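This request not only terminates early but also tells Elasticsearch that the total field is not needed, which saves it from counting all matching documents:

GET /events/_search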
{
  "size": 10,
  "sort": [ 
      { "timestamp": "desc" }
  ],
  "track_total_hits": false
}

Response without the total field:

{
  "_shards": ...
  "hits" : {
      "max_score" : null,
      "hits" : []
  },
  "took": 20,
  "timed_out": false
}
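Why a matching index sort allows early termination can be shown with plain lists. This is a conceptual sketch, not how Lucene is implemented: the two functions are hypothetical and a list of integers stands in for the timestamps stored in one segment.

```python
import heapq


def top_n_full_scan(timestamps, n):
    """Unsorted segment: every document has to be visited."""
    visited = len(timestamps)
    return heapq.nlargest(n, timestamps), visited


def top_n_sorted_segment(timestamps_desc, n):
    """Segment already stored in descending order: stop after n documents."""
    top = timestamps_desc[:n]
    return top, len(top)


segment = list(range(1000))                 # stored in arbitrary order
sorted_segment = sorted(segment, reverse=True)  # stored by timestamp desc

print(top_n_full_scan(segment, 3))              # → ([999, 998, 997], 1000)
print(top_n_sorted_segment(sorted_segment, 3))  # → ([999, 998, 997], 3)
```

Both calls return the same hits, but the second one cannot report how many documents matched in total, which is why the request above sets track_total_hits to false.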

Indexing pressure

https://www.elastic.co/guide/en/elasticsearch/reference/7.15/index-modules-indexing-pressure.html

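Indexing documents puts memory and CPU load on the system. Each indexing operation passes through coordinating, primary, and replica stages, and if too much indexing work is introduced the cluster can become saturated. That can hurt other operations such as search, cluster coordination, and background processing.

indexing_pressure.memory.limit: the number of outstanding bytes that indexing requests may consume. When this limit is reached or exceeded, the node rejects new coordinating and primary operations. When replica operations consume 1.5 times this limit, the node rejects new replica operations. Defaults to 10% of the heap.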
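The two thresholds can be modelled in a few lines. This is a simplified sketch that follows the description above, not the actual Elasticsearch accounting: IndexingPressureModel is a hypothetical class, the limit is given directly in bytes, and it assumes that all outstanding bytes (coordinating, primary, and replica) count towards the coordinating and primary check.

```python
class IndexingPressureModel:
    """Toy model of the indexing_pressure.memory.limit rejection rules."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.replica_limit = int(limit_bytes * 1.5)
        self.coordinating_and_primary = 0
        self.replica = 0

    def try_start(self, stage, size):
        """Return True if the operation is accepted, False if rejected."""
        if stage in ("coordinating", "primary"):
            # Assumption: every outstanding byte counts towards this check.
            outstanding = self.coordinating_and_primary + self.replica
            if outstanding >= self.limit:
                return False
            self.coordinating_and_primary += size
            return True
        if stage == "replica":
            # Replica work gets 1.5x headroom before it is rejected.
            if self.replica >= self.replica_limit:
                return False
            self.replica += size
            return True
        raise ValueError(f"unknown stage: {stage!r}")

    def finish(self, stage, size):
        """Release the bytes of a completed operation."""
        if stage == "replica":
            self.replica -= size
        else:
            self.coordinating_and_primary -= size


node = IndexingPressureModel(limit_bytes=100)
print(node.try_start("coordinating", 60))  # accepted, 60 outstanding
print(node.try_start("primary", 50))       # accepted, 110 outstanding
print(node.try_start("coordinating", 1))   # rejected, limit exceeded
node.finish("primary", 50)
node.finish("coordinating", 60)
print(node.try_start("replica", 150))      # accepted, replica at 150
print(node.try_start("replica", 1))        # rejected, 1.5x limit reached
```

The larger replica threshold reflects that replica operations belong to writes a primary has already accepted.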
