Elasticsearch Fundamentals and the Distributed Document System

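Stage 1: Elasticsearch concepts and architecture

What Elasticsearch does

Installing Elasticsearch on Linux

Core Elasticsearch concepts

Basic distributed architecture of Elasticsearch

The shard and replica mechanism, and shard allocation on a single node

Scaling out horizontally, going beyond the scale-out limit, and improving fault tolerance

Stage 2: The Elasticsearch distributed document system

7. Distributed document system: internals of document operations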

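Stage 1: Elasticsearch concepts and architecture

What Elasticsearch does

(1) A distributed search engine and data analytics engine

(2) Full-text search, structured search, and data analytics

(3) Near-real-time processing of very large data sets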

Installing Elasticsearch on Linux

1. Download

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.0.0.tar.gz

2. Extract and set ownership (Elasticsearch refuses to start as root, so a separate user is needed to run it)

useradd esuser

passwd esuser

cd /home/esuser/

tar zxvf elasticsearch-6.0.0.tar.gz

chown -R esuser:esuser elasticsearch-6.0.0

3. Start (after switching to the non-root user)

cd elasticsearch-6.0.0

sh ./bin/elasticsearch

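(If startup fails because of the JDK version, you can change the system environment variables. On this machine the system environment points to the bundled JDK 1.7, which other services depend on and which therefore cannot be changed, so a temporary variable can be set in bin/elasticsearch-env instead:

JAVA_HOME=/usr/java/jdk1.8.0_144)

4. Test access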

curl localhost:9200

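Note: at this point the node cannot yet be reached through the machine's IP address; that requires a configuration change (stop the process before changing it). The configuration steps follow.

5. Stop Elasticsearch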

cd elasticsearch-6.0.0/bin

ps -ef |grep elasticsearch

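kill -9 <pid>   (the PID found above: the first number after the user name on the first line)

6. Edit

In config/elasticsearch.yml, set: network.host: 0.0.0.0

7. Restart

sh elasticsearch   (add -d to run in the background)

This fails with errors:

The first three errors: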

ERROR: [4] bootstrap checks failed

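# Too few file descriptors; at least 65536 are required

[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]

# Thread limit too low; at least 2048 are required (like the classic game 2048)

[2]: max number of threads [1024] for user [king] is too low, increase to at least [2048]

# vm.max_map_count (the limit on memory-mapped areas) too low; at least 262144 is required

[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

Fixes:

1. Raise the file descriptor limit

[root@localhost ~]# vi /etc/security/limits.conf

Add the following to the file (* means any user)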

* soft nofile 65536

* hard nofile 131072

* soft nproc 2048

* hard nproc 4096

2. Raise the thread limit

[root@localhost ~]# vi /etc/security/limits.d/90-nproc.conf

Change the line

* soft nproc 1024

to

* soft nproc 2048

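3. Raise the memory-map limit (vm.max_map_count)

[root@localhost ~]# vim /etc/sysctl.conf

Add the line

vm.max_map_count=655360

Apply the setting (afterwards, it is best to restart Elasticsearch from a fresh terminal session):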

sysctl -p
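
The three bootstrap checks above each compare a limit against a fixed minimum. The following is a minimal Python sketch of that comparison, assuming a Linux host. The thresholds are the ones quoted in the error messages; the function names are made up for illustration and are not part of Elasticsearch.

```python
import resource

# Minimums quoted in the bootstrap-check errors above.
REQUIRED = {"nofile": 65536, "nproc": 2048, "max_map_count": 262144}


def failed_checks(current, required=REQUIRED):
    """Return the names of the limits in `current` that are below the minimum."""
    return sorted(name for name, minimum in required.items()
                  if current.get(name, 0) < minimum)


def _soft_limit(which):
    """Soft limit of the current process; 'unlimited' is treated as infinity."""
    soft, _hard = resource.getrlimit(which)
    return float("inf") if soft == resource.RLIM_INFINITY else soft


def read_current_limits():
    """Read the limits that apply to the current user and process (Linux only)."""
    with open("/proc/sys/vm/max_map_count") as f:
        max_map_count = int(f.read())
    return {"nofile": _soft_limit(resource.RLIMIT_NOFILE),
            "nproc": _soft_limit(resource.RLIMIT_NPROC),
            "max_map_count": max_map_count}
```

Running `failed_checks(read_current_limits())` as the Elasticsearch user, in a new login session, lists whichever limits are still too low.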

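The fourth error:

[4] system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

Cause:

CentOS 6 does not support seccomp, while ES 5.2.0 sets bootstrap.system_call_filter to true by default and checks for it. The check therefore fails, and the failed check prevents Elasticsearch from starting.

Fix:

In elasticsearch.yml, set bootstrap.system_call_filter to false. Note that it should go below the Memory section: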

bootstrap.memory_lock: false

bootstrap.system_call_filter: false

8. Restart again (preferably from a different terminal session)

[esuser@localhost bin]$ ./elasticsearch

[root@localhost ~]# curl xxx.xx.30.27:9200

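9. Configure the firewall

At this point the node can be reached from the machine itself through its own IP address. The firewall has not been opened yet, so the node cannot be reached from outside.

1) Takes effect after a reboot

Enable: chkconfig iptables on

Disable: chkconfig iptables off

2) Takes effect immediately, lost after a reboot

Start: service iptables start

Stop: service iptables stop

I stopped the firewall temporarily.

Installing Elasticsearch on Windows

1. Install a JDK, version 1.8.0_73 or later, and verify with java -version

2. Download and unzip the Elasticsearch package, and look over its directory layout

3. Start Elasticsearch: bin\elasticsearch.bat. One of Elasticsearch's features is that it works out of the box: for a small or medium application with little data and simple operations, you can just start it and use it

4. Check that Elasticsearch started: http://localhost:9200/?pretty

name: the node name

cluster_name: the cluster name (the default is elasticsearch)

version.number: 5.2.0, the Elasticsearch version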

{
  "name" : "4onsTYV",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "nKZ9VK_vQdSQ1J0Dx9gx1Q",
  "version" : {
    "number" : "5.2.0",
    "build_hash" : "24e05b9",
    "build_date" : "2017-01-24T19:52:35.800Z",
    "build_snapshot" : false,
    "lucene_version" : "6.4.0"
  },
  "tagline" : "You Know, for Search"
}

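5. Change the cluster name in elasticsearch.yml

6. Download and unzip Kibana. Its developer console is used to operate Elasticsearch and serves as the main interface for learning Elasticsearch here

7. Start Kibana: bin\kibana.bat

8. Open the Dev Tools page

9. GET _cluster/health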

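Core Elasticsearch concepts

(1) Near Realtime (NRT): near real time, in two senses. There is a small delay (about 1 second) between writing data and that data becoming searchable; and searches and analytics on Elasticsearch return within seconds

(2) Cluster: a cluster contains multiple nodes. Which cluster a node belongs to is decided by a configuration setting (the cluster name, elasticsearch by default). For small and medium applications it is perfectly normal to start with a one-node cluster

(3) Node: one node in a cluster. A node also has a name (assigned randomly by default), which matters when performing operations and maintenance. By default a node joins a cluster named "elasticsearch", so if you simply start a group of nodes they automatically form one elasticsearch cluster; a single node can also form a cluster on its own

(4) Document & field: a document is the smallest unit of data in Elasticsearch. A document can be one customer record, one product-category record, or one order record, and is usually represented as JSON. Each type under an index can store many documents. A document contains multiple fields, and each field is one data field.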

product document

{
  "product_id": "1",
  "product_name": "Colgate toothpaste",
  "product_desc": "Effective whitening",
  "category_id": "2",
  "category_name": "Household goods"
}

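(5) Index: an index holds a set of documents with a similar structure, for example a customer index, a product-category index, or an order index. An index has a name. One index contains many documents and represents one class of similar or identical documents. A product index, for instance, might hold all product data, that is, all product documents.

(6) Type: each index can have one or more types. A type is a logical category of data within an index, and documents under one type all have the same fields. A blog system with a single index, for example, could define a user type, a post type, and a comment type.

A product index holds all product data, that is, the product documents

But products fall into many kinds, and the fields of each kind may differ slightly. Appliances, for instance, may carry a special field such as the after-sales service period; fresh food may carry a special field such as shelf life

types: a household-goods type, an appliance type, a fresh-food type

Household-goods type: product_id, product_name, product_desc, category_id, category_name

Appliance type: product_id, product_name, product_desc, category_id, category_name, service_period

Fresh-food type: product_id, product_name, product_desc, category_id, category_name, eat_period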

Each type contains a set of documents

{
  "product_id": "2",
  "product_name": "Changhong TV",
  "product_desc": "4K HD",
  "category_id": "3",
  "category_name": "Appliances",
  "service_period": "1 year"
}

{
  "product_id": "3",
  "product_name": "Shrimp",
  "product_desc": "All natural, from Iceland",
  "category_id": "4",
  "category_name": "Fresh food",
  "eat_period": "7 days"
}

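(7) shard: a single machine cannot store a very large amount of data, so Elasticsearch can split the data of one index into multiple shards stored across several servers. Shards make horizontal scaling possible: more data can be stored, and search and analytics work is spread over several servers, which raises throughput and performance. Each shard is a Lucene index.

(8) replica: any server can fail or go down at any time, and the shards on it may then be lost, so each shard can have multiple replica copies. Replicas take over when a shard fails, so no data is lost, and multiple replicas also raise search throughput and performance. The number of primary shards is set once when the index is created and cannot be changed (default 5); the number of replica shards can be changed at any time (default 1). By default each index has 10 shards: 5 primary shards and 5 replica shards. The minimum high-availability setup is 2 servers.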

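Basic distributed architecture of Elasticsearch

1. Elasticsearch hides its complex distributed mechanisms from the user

2. Vertical and horizontal scaling

3. Data rebalancing when nodes are added or removed

4. The master node

5. A peer-to-peer distributed architecture in which all nodes are equal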

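The shard and replica mechanism, and shard allocation on a single node

(1) An index contains multiple shards

(2) Each shard is a minimal unit of work that holds part of the data. It is a Lucene instance with the full ability to build indexes and handle requests

(3) When nodes are added or removed, shards are rebalanced across the nodes automatically

(4) Primary and replica shards: each document lives in exactly one primary shard and that shard's replicas; it cannot live in more than one primary shard

(5) A replica shard is a copy of a primary shard. It provides fault tolerance and also serves read requests

(6) The number of primary shards is fixed when the index is created; the number of replica shards can be changed at any time

(7) The default is 5 primary shards and 1 replica, giving 10 shards in total: 5 primary shards and 5 replica shards

(8) A primary shard cannot be placed on the same node as its own replica shard (otherwise a node failure would lose both the primary and its copy, defeating the purpose), but it can share a node with the replica of a different primary shard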

------------------------------------------------------------------------------------------------

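(1) On a single node, create an index with 3 primary shards and 3 replica shards

(2) The cluster status is yellow

(3) Only the 3 primary shards are allocated to the one available node; the 3 replica shards cannot be allocated

(4) The cluster works normally, but if the node goes down all data is lost and the cluster becomes unavailable, unable to serve any request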

PUT /test_index
{
  "settings" : {
    "number_of_shards" : 3,
    "number_of_replicas" : 1
  }
}
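
The yellow status described above follows from one rule: copies of the same shard must sit on different nodes. Here is a simplified Python model of that rule. It is only a sketch of the reasoning, not Elasticsearch's allocation logic, and the function names are hypothetical.

```python
def cluster_status(number_of_replicas, nodes):
    """Simplified health model: copies of one shard must live on different nodes."""
    copies_per_shard = 1 + number_of_replicas
    assignable = min(nodes, copies_per_shard)
    if assignable == 0:
        return "red"      # no node can hold the primary
    if assignable < copies_per_shard:
        return "yellow"   # primaries assigned, some replicas unassigned
    return "green"


def unassigned_shards(number_of_shards, number_of_replicas, nodes):
    """Number of shard copies that cannot be placed on any node."""
    copies_per_shard = 1 + number_of_replicas
    return number_of_shards * (copies_per_shard - min(nodes, copies_per_shard))
```

For the index above, `cluster_status(1, 1)` is "yellow" and `unassigned_shards(3, 1, 1)` is 3; adding a second node makes the status "green".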

(1) Replica shard allocation: 3 primary shards, 3 replica shards, 1 node

(2) primary ---> replica synchronization

(3) Read requests: served by the primary or a replica

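Scaling out horizontally, going beyond the scale-out limit, and improving fault tolerance

(1) Primary and replica shards are rebalanced automatically: 6 shards, 3 primary and 3 replica

(2) Each node holds fewer shards, so each shard gets a larger share of IO, CPU, and memory, and performs better

(3) The scale-out limit: with 6 shards (3 primary, 3 replica) you can scale to at most 6 machines, where each shard has a whole server to itself and performance is at its best

(4) Going beyond the limit: change the replica count dynamically to get 9 shards (3 primary, 6 replica) and scale to 9 machines, which gives 3 times the read throughput of 3 machines

(5) On 3 machines, 9 shards (3 primary, 6 replica) leave fewer resources per shard but tolerate more failures: up to 2 machines can go down, whereas with 6 shards (3 primary, 3 replica) each shard has only two copies, so only 1 machine can go down

(6) Taken together, these points explain on the one hand how scaling out works and how it raises overall throughput, and on the other hand how to improve fault tolerance, so that as many servers as possible can go down without losing data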
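
The arithmetic behind these points can be written down in a few lines of Python. This is a sketch under one assumption: the copies of each shard are spread over distinct nodes. The function names are illustrative only.

```python
def total_shards(primaries, replicas):
    """Total shard copies; also the largest node count that still gets one shard each."""
    return primaries * (1 + replicas)


def tolerated_node_failures(replicas, nodes):
    """Nodes that can fail while every shard keeps at least one copy."""
    return min(replicas, nodes - 1)
```

With 3 primaries, `total_shards(3, 1)` is 6 and `total_shards(3, 2)` is 9, and on 3 nodes `tolerated_node_failures(2, 3)` is 2.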

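(1) 9 shards, 3 nodes

(2) The master node goes down; a new master is elected automatically; status red

(3) Replica fault tolerance: the new master promotes replicas to primary shards; status yellow

(4) Restart the failed node: the master copies replicas to that node, which reuses its existing shards and syncs only the changes made while it was down; status green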

Stage 2: The Elasticsearch distributed document system

1. _index metadata

2. _type metadata

3. _id metadata

{
  "_index": "test_index",
  "_type": "test_type",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "test_content": "test test"
  }
}

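1.1 _index metadata

(1) Identifies the index in which a document is stored

(2) Similar data goes in one index, dissimilar data in different indexes: a product index (all products), a sales index (all product sales data), an inventory index (all inventory data). Putting product, sales, and human resources (employee) data all into one large index, say a company index, is not appropriate.

(3) An index contains many similar documents. Similar here means that most of the documents' fields are the same. If you store 3 documents whose fields are all completely different, they are not similar and do not belong in one index.

(4) Index names must be lowercase, must not start with an underscore, and must not contain commas: product, website, blog

1.2 _type metadata

(1) Identifies the category (type) within the index to which a document belongs

(2) An index is usually divided into several types that logically classify slightly different kinds of data within it. A batch of similar data may share most fields yet differ in a few of them. Products, for example, might be divided into electronics, fresh food, household goods, and so on.

(3) A type name may be uppercase or lowercase, but it likewise must not start with an underscore or contain commas

1.3 _id metadata

(1) The unique identifier of a document. Together with the index and type, it uniquely identifies and locates a document

(2) We can specify a document's id manually (put /index/type/id), or leave it out and let Elasticsearch create one for us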

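1. Specifying the document id manually

(1) Whether manual ids are appropriate depends on the application:

Typically this approach is used when importing data into Elasticsearch from another system: the unique identifier the data already has in that system becomes the document id. Suppose we are building search for an e-commerce site, or employee lookup for an OA system. The data first exists in the site's or IT system's own database, where it necessarily has a primary key (auto-increment, UUID, or a business number). When that data is imported into Elasticsearch, reusing the existing primary key is a good fit.

If instead Elasticsearch is the system's only primary data store, so that data has no id when it is produced and goes straight into Elasticsearch, then specifying the id manually is a poor fit because you do not know what the id should be. In that case let Elasticsearch generate the id automatically, as described below.

(2) put /index/type/id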

PUT /test_index/test_type/2
{
  "test_content": "my test"
}

2. Auto-generating the document id

(1) post /index/type

POST /test_index/test_type
{
  "test_content": "my test"
}

{
  "_index": "test_index",
  "_type": "test_type",
  "_id": "AVp4RN0bhjxldOOnBxaE",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

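(2) Auto-generated ids are 20 characters long, URL-safe, base64-encoded GUIDs, designed so that ids generated in parallel across a distributed system do not collide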
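
To see where the length of 20 comes from: 15 bytes encode to exactly 20 base64 characters with no padding. The sketch below produces an id of the same shape from random bytes. It is an illustration only and is not the algorithm Elasticsearch uses to build its ids; the function name is made up.

```python
import base64
import os


def random_doc_id():
    """15 random bytes encode to exactly 20 URL-safe base64 characters, no padding."""
    return base64.urlsafe_b64encode(os.urandom(15)).decode("ascii")
```

The URL-safe alphabet uses `-` and `_` in place of `+` and `/`, so the id can appear in a request path without escaping.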

2.1 _source metadata

put /test_index/test_type/1
{
  "test_field1": "test field1",
  "test_field2": "test field2"
}

get /test_index/test_type/1

{
  "_index": "test_index",
  "_type": "test_type",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "test_field1": "test field1",
    "test_field2": "test field2"
  }
}

_source metadata: the JSON string we put in the request body when creating a document is, by default, returned to us unchanged on a get.

2.2 Customizing the returned result

Customize the result by specifying which fields of _source to return

GET /test_index/test_type/1?_source=test_field1,test_field2
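
The effect of the `_source` parameter on a flat document can be pictured with a small Python function. This is a sketch that handles only top-level field names as used in the request above; the function name is hypothetical.

```python
def filter_source(source, fields):
    """Keep only the top-level fields named in the comma-separated `fields` string."""
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    return {name: source[name] for name in wanted if name in source}
```

For example, filtering the document above with `"test_field1"` keeps only that one field.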

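3.1 Full replacement of a document

(1) The syntax is the same as for creating a document: if the document id does not exist, the request creates it; if the id already exists, the request is a full replacement that replaces the document's JSON content

(2) Documents are immutable. To change a document's content, the first option is full replacement: re-index the document and replace everything in it

(3) Elasticsearch marks the old document as deleted and adds the new document we supplied. As more and more documents are created, Elasticsearch removes the documents marked as deleted in the background at a suitable time

3.2 Forced creation of a document

(1) Creating and fully replacing a document share the same syntax. Sometimes we only want to create a document and never replace an existing one. How can creation be forced?

(2) PUT /index/type/id?op_type=create, PUT /index/type/id/_create

3.3 Deleting a document

(1) DELETE /index/type/id

(2) The document is not physically deleted right away; it is only marked as deleted, and it is removed in the background as data accumulates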

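4.1 Why batch queries help

Querying one document at a time means that fetching 100 documents takes 100 network requests, which is a substantial overhead

With a batch query, fetching 100 documents takes only 1 network request, cutting the number of network round trips by a factor of 100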

4.2 mget syntax

(1) Querying one document at a time

GET /test_index/test_type/1

GET /test_index/test_type/2

(2) Batch query with mget

GET /_mget
{
  "docs" : [
    { "_index" : "test_index", "_type" : "test_type", "_id" : 1 },
    { "_index" : "test_index", "_type" : "test_type", "_id" : 2 }
  ]
}

{
  "docs": [
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "1",
      "_version": 2,
      "found": true,
      "_source": {
        "test_field1": "test field1",
        "test_field2": "test field2"
      }
    },
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "2",
      "_version": 1,
      "found": true,
      "_source": {
        "test_content": "my test"
      }
    }
  ]
}

(3) If the documents are in different types under one index

GET /test_index/_mget

(4) If the documents are all in the same type under the same index, which is the simplest case

GET /test_index/test_type/_mget

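4.3 Why mget matters

mget is important: in general, whenever you need to fetch several documents at once, use the batch API

Reducing the number of network round trips as far as possible can improve performance several times over, even tens of times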
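
The request body of mget gets shorter the more of the address is already in the URL. A small Python helper that builds the body for the three URL forms might look like this; it is a sketch, and the helper name and parameters are illustrative.

```python
def mget_body(ids, index=None, doc_type=None):
    """Build an _mget request body.

    Pass `index` / `doc_type` only when they are NOT already part of the URL.
    """
    if index is None and doc_type is None:
        # Index and type both come from the URL: a plain id list is enough.
        return {"ids": list(ids)}
    docs = []
    for doc_id in ids:
        entry = {"_id": doc_id}
        if index is not None:
            entry["_index"] = index
        if doc_type is not None:
            entry["_type"] = doc_type
        docs.append(entry)
    return {"docs": docs}
```

`mget_body([1, 2], "test_index", "test_type")` suits GET /_mget, `mget_body([1, 2], doc_type="test_type")` suits GET /test_index/_mget, and `mget_body([1, 2])` suits GET /test_index/test_type/_mget.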

5.1 bulk syntax

POST /_bulk
{ "delete": { "_index": "test_index", "_type": "test_type", "_id": "3" }}
{ "create": { "_index": "test_index", "_type": "test_type", "_id": "12" }}
{ "test_field": "test12" }
{ "index": { "_index": "test_index", "_type": "test_type", "_id": "2" }}
{ "test_field": "replaced test2" }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": "1", "_retry_on_conflict" : 3} }
{ "doc" : {"test_field2" : "bulk test1"} }

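Each operation takes two JSON strings, in this form:

{"action": {"metadata"}}

{"data"}

For example, creating a document inside a bulk request looks like this:

{"index": {"_index": "test_index", "_type": "test_type", "_id": "1"}}

{"test_field1": "test1", "test_field2": "test2"}

Which operations are available?

(1) delete: deletes a document; needs only one JSON string

(2) create: equivalent to PUT /index/type/id/_create, a forced create

(3) index: an ordinary PUT, which either creates the document or replaces it in full

(4) update: a partial update

The bulk API is strict about JSON layout: each JSON string must sit on a single line with no line breaks inside it, and consecutive JSON strings must be separated by a newline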
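
Building such a body by hand is error-prone, so it helps to generate it. The Python sketch below serializes a list of operations into the one-JSON-per-line layout; the function name and the tuple layout of its input are assumptions made for this example.

```python
import json


def bulk_payload(operations):
    """Serialize (action, metadata, data) tuples into a bulk request body.

    `data` is None for actions without a data line, such as delete.
    json.dumps without indent never emits a raw newline, so each JSON
    string stays on one line.
    """
    lines = []
    for action, metadata, data in operations:
        lines.append(json.dumps({action: metadata}))
        if data is not None:
            lines.append(json.dumps(data))
    return "\n".join(lines) + "\n"
```

The trailing newline is deliberate: the last line of a bulk body must also end with a newline.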

In a bulk request, a failure of any one operation does not affect the others; the response reports the error for the failed operation

POST /test_index/_bulk
{ "delete": { "_type": "test_type", "_id": "3" }}
{ "create": { "_type": "test_type", "_id": "12" }}
{ "test_field": "test12" }
{ "index": { "_type": "test_type" }}
{ "test_field": "auto-generate id test" }
{ "index": { "_type": "test_type", "_id": "2" }}
{ "test_field": "replaced test2" }
{ "update": { "_type": "test_type", "_id": "1", "_retry_on_conflict" : 3} }
{ "doc" : {"test_field2" : "bulk test1"} }

POST /test_index/test_type/_bulk
{ "delete": { "_id": "3" }}
{ "create": { "_id": "12" }}
{ "test_field": "test12" }
{ "index": { }}
{ "test_field": "auto-generate id test" }
{ "index": { "_id": "2" }}
{ "test_field": "replaced test2" }
{ "update": { "_id": "1", "_retry_on_conflict" : 3} }
{ "doc" : {"test_field2" : "bulk test1"} }

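5.2 The best bulk size

A bulk request is loaded into memory, so if it is too large performance actually drops. The best bulk size has to be found by repeated trials. A common starting point is 1000 to 5000 documents, increased gradually. Measured by size, 5 to 15MB per request is a good range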
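
One way to keep each request inside both bounds is to cut the stream of documents into batches by count and by size. The generator below is a minimal Python sketch. It assumes the documents are already serialized to strings, and its name and default limits (1000 documents, 10MB) are illustrative starting values taken from the ranges above.

```python
def chunk_bulk(docs, max_docs=1000, max_bytes=10 * 1024 * 1024):
    """Yield lists of serialized docs, closing a batch at max_docs items or max_bytes bytes."""
    batch, size = [], 0
    for doc in docs:
        doc_size = len(doc.encode("utf-8"))
        # Close the current batch before it would exceed either limit.
        if batch and (len(batch) >= max_docs or size + doc_size > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_size
    if batch:
        yield batch
```

A single document larger than `max_bytes` still goes out in a batch of its own rather than being dropped.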

(1) _version metadata

PUT /test_index/test_type/6
{
  "test_field": "test test"
}

{
  "_index": "test_index",
  "_type": "test_type",
  "_id": "6",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

When a document is first created, its internal _version number is 1. After that, every update or delete of the document increments _version by 1; even a delete increments the version number

{
  "found": true,
  "_index": "test_index",
  "_type": "test_type",
  "_id": "6",
  "_version": 4,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}

This shows indirectly that a document is not physically removed the moment it is deleted, because information such as its version number is still kept. If you delete a document and then create it again, the version number continues from the delete version, incremented by 1
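
The version bookkeeping described here can be mimicked with a toy in-memory store. The Python class below is only a model of the observable behaviour (version numbers and result strings), not of how Elasticsearch or Lucene store documents; the class and its methods are invented for this sketch.

```python
class VersionedStore:
    """Toy model of _version bookkeeping: deletes are tombstones that keep the version."""

    def __init__(self):
        # id -> {"version": int, "deleted": bool, "source": dict or None}
        self._docs = {}

    def index(self, doc_id, source):
        entry = self._docs.get(doc_id)
        version = entry["version"] + 1 if entry else 1
        created = entry is None or entry["deleted"]
        self._docs[doc_id] = {"version": version, "deleted": False, "source": source}
        return {"_version": version, "result": "created" if created else "updated"}

    def delete(self, doc_id):
        entry = self._docs.get(doc_id)
        if entry is None or entry["deleted"]:
            return {"found": False}
        # Mark as deleted instead of removing, and bump the version.
        entry.update(version=entry["version"] + 1, deleted=True, source=None)
        return {"found": True, "_version": entry["version"], "result": "deleted"}
```

Indexing the same id three times and then deleting it yields version 4 on the delete, and re-creating the document afterwards yields version 5.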

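Elasticsearch also offers a feature that lets you control concurrency with a version number you maintain yourself rather than with its internal _version. Suppose your data also lives in MySQL and your application already maintains its own version number, generated however you like under program control. For optimistic concurrency control you may then prefer your own version number over Elasticsearch's internal _version.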
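
With an externally maintained version, the acceptance rule changes from "versions must match" to "the incoming version must be higher". The Python sketch below states that rule; it assumes the request is sent with version_type=external, and the function name is hypothetical.

```python
def accept_external_write(stored_version, incoming_version):
    """Assumed rule for version_type=external: accept only a strictly higher version.

    `stored_version` is None when the document does not exist yet.
    """
    return stored_version is None or incoming_version > stored_version
```

A write carrying the same version as the stored one is rejected, which is what prevents a stale copy from overwriting newer data.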

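What is a partial update?

PUT /index/type/id uses the same syntax for creating and for replacing a document

In an application, the usual flow is:

(1) The application sends a get request, fetches the document, and shows it in the front end for the user to view and edit

(2) The user edits the data in the front end and submits it to the back end

(3) The back-end code applies the user's changes in memory and assembles the complete modified document

(4) It then sends a PUT request to Elasticsearch for a full replacement

(5) Elasticsearch marks the old document as deleted and creates a new document

partial update

post /index/type/id/_update
{
  "doc": {
    "only the few fields to change; the full document is not needed"
  }
}

This looks more convenient: each request carries only the few fields that changed, and the full document does not have to be sent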

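How partial update works, and its advantages

Advantages of partial update over a full-replacement request:

The read, the modification, and the write-back all happen inside one shard in Elasticsearch, which avoids the network transfer overhead. With a full replacement, the document is fetched from Elasticsearch into the Java application, modified there, and sent back to Elasticsearch as an update request, which costs two network round trips; a partial update does everything within one shard.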
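
What the shard does with the `doc` part of a partial update is essentially a merge into the existing document. A minimal Python sketch of such a merge follows; the function name is invented here, and the sketch leaves out versioning and conflict retries.

```python
import copy


def apply_partial_update(source, partial_doc):
    """Return a new document with `partial_doc` merged into `source`.

    Nested objects are merged recursively; any other value replaces the old one.
    """
    merged = copy.deepcopy(source)
    for key, value in partial_doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_partial_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

The original document is left untouched, in line with documents being immutable: the merged result is indexed as a new document and the old one is marked as deleted.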

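7.1 Document routing (why the number of primary shards is fixed)

(1) What does routing a document to a shard mean?

(2) Routing algorithm: shard = hash(routing) % number_of_primary_shards

For example, an index has 3 primary shards: P0, P1, P2

Every create, read, update, or delete of a document carries a routing value, which by default is the document's _id (specified manually or auto-generated)

routing = _id; suppose _id=1

The routing value is passed to a hash function, which produces a hash of the routing value: hash(routing) = 21

The hash is then taken modulo the number of primary shards of the index: 21 % 3 = 0

This decides that the document lives on P0.

The routing value is what determines which shard a document is on. It defaults to _id and can also be specified manually. The same routing value always produces the same hash

Whatever the hash is, taking it modulo number_of_primary_shards always gives a result in the range 0 to number_of_primary_shards-1; here 0, 1, or 2.

(3) _id or custom routing value

The default routing value is _id

A routing value can also be specified manually when sending a request, for example put /index/type/id?routing=user_id

Specifying the routing value manually is useful: it guarantees that a given class of documents is always routed to one shard, which helps later with application-level load balancing and with batch read performance

(4) Why the number of primary shards cannot change

Because the shard is computed as hash(routing) % number_of_primary_shards, changing the number of primary shards changes the result for documents that are already indexed. A document stored on P0 might then be looked up on a different shard and not be found. That is why the primary shard count is fixed when the index is created.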
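
The routing formula, and the reason the shard count has to stay fixed, can be tried out in a few lines of Python. In this sketch CRC32 stands in for the hash function, which is an assumption made for illustration and not the hash Elasticsearch uses; the function names are hypothetical as well.

```python
import zlib


def route(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards, with CRC32 as a stand-in hash."""
    return zlib.crc32(str(routing).encode("utf-8")) % number_of_primary_shards


def moved_after_resize(ids, old_count, new_count):
    """Ids whose computed shard would change if the primary shard count changed."""
    return [i for i in ids if route(i, old_count) != route(i, new_count)]
```

`route` always returns the same shard for the same routing value, while `moved_after_resize(range(100), 3, 4)` lists the documents that would be looked up on the wrong shard after a change from 3 to 4 primaries.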

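7.2 How create, update, and delete work internally

(1) The client picks a node and sends the request to it; that node becomes the coordinating node

(2) The coordinating node routes the document and forwards the request to the node that holds the corresponding primary shard

(3) The primary shard on that node handles the request and then syncs the data to the replica nodes

(4) Once the coordinating node sees that the primary node and all replica nodes are done, it returns the response to the client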

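7.3 Write consistency and the quorum mechanism

(1) consistency: one (primary shard), all (all shards), quorum (default)

Any create, update, or delete, for example put /index/type/id, can carry a consistency parameter that states the write consistency we want. Note that this parameter exists only in Elasticsearch versions before 5.0; from 5.0 on it was replaced by wait_for_active_shards.

put /index/type/id?consistency=quorum

one: the write can proceed as long as the primary shard is active

all: the write can proceed only if the primary shard and all replica shards are active

quorum: the default; the write can proceed only if a majority of the shard copies are active and available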

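(2) The quorum mechanism: before a write, a majority of the copies of the target shard must be available: int( (primary + number_of_replicas) / 2 ) + 1. It is enforced only when number_of_replicas > 1

quorum = int( (primary + number_of_replicas) / 2 ) + 1

Here primary is the one primary of the shard being written, so it counts as 1, and number_of_replicas is the value from the index settings. For example, with number_of_replicas=3 each shard has 1 + 3 = 4 copies

quorum = int( (1 + 3) / 2 ) + 1 = 3

So at least 3 of that shard's 4 copies must be active before the write can proceed

(3) If there are fewer nodes than the quorum, the quorum may not be reachable, and then no write can be executed

With number_of_replicas=3, at least 3 copies must be active. By the shard and replica rules covered earlier, copies of the same shard must sit on different nodes, so with only 2 machines at most 2 copies can be allocated and writes cannot be executed

Elasticsearch handles one case specially, which is why the check applies only when number_of_replicas > 1. Suppose there is 1 primary shard and replica=1, so 2 copies in total

int( (1 + 1) / 2 ) + 1 = 2, which would require both copies to be active. A single-node cluster has only 1 active copy, so without the special handling a single-node cluster could not accept writes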

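(4) When the quorum is not met, the write waits, 1 minute by default. A timeout can be given, where 100 means 100 milliseconds and 30s means 30 seconds

During the wait, the hope is that more shard copies become active; if that does not happen, the request times out

We can add a timeout parameter to a write, for example put /index/type/id?timeout=30 (30 milliseconds; write 30s for 30 seconds), to set how long Elasticsearch waits when the quorum is not met. It can be shorter or longer than the default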
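
The quorum formula and the special case for number_of_replicas <= 1 fit into two small Python functions. This is a sketch of the rule as stated in this section, for the versions that have the consistency parameter; the function names are invented for illustration.

```python
def quorum(primary, number_of_replicas):
    """int((primary + number_of_replicas) / 2) + 1, as in the formula above."""
    return (primary + number_of_replicas) // 2 + 1


def write_allowed(active_copies, primary, number_of_replicas):
    """Quorum is only enforced when number_of_replicas > 1."""
    if number_of_replicas <= 1:
        return active_copies >= 1
    return active_copies >= quorum(primary, number_of_replicas)
```

`quorum(1, 1)` is 2, yet `write_allowed(1, 1, 1)` is true, which is the single-node special case; with 3 replicas, 2 active copies are not enough but 3 are.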

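7.4 How reads work internally

1. The client sends the request to any node, which becomes the coordinating node

2. The coordinating node routes the document and forwards the request to the corresponding node. It uses round-robin to pick among the primary shard and all of its replicas, so that read requests are load balanced

3. The node that receives the request returns the document to the coordinating node

4. The coordinating node returns the document to the client

5. Special case: while a document is still being indexed, it may exist only on the primary shard and not yet on any replica shard, so a read may fail to find it. Once indexing completes, the primary shard and the replica shards all have it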
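
Round-robin selection among the copies of one shard can be sketched in Python with `itertools.cycle`. The copy names below are placeholders, and the helper is an illustration rather than Elasticsearch code.

```python
import itertools


def round_robin(copies):
    """Return a function that hands out shard copies in rotating order."""
    cycle = itertools.cycle(copies)
    return lambda: next(cycle)
```

With `pick = round_robin(["P0", "R0-1", "R0-2"])`, successive calls to `pick()` return P0, R0-1, R0-2, and then P0 again, so each copy serves about a third of the reads.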

7.5 JSON format

The bulk API's unusual JSON format

{"action": {"meta"}}\n

{"data"}\n

A conventional JSON array format would instead look like this:

[{
  "action": {
  },
  "data": {
  }
}]

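1. Each operation in a bulk request may have to be forwarded to a shard on a different node

2. Suppose a well-formed JSON array format were used

It would allow arbitrary line breaks and be very readable. But on receiving JSON in that standard format, Elasticsearch would have to process it as follows

(1) Parse the JSON array into a JSONArray object. At that point memory holds two copies of the same data: the JSON text and the JSONArray object

(2) Parse each JSON element in the array and route the document in each request

(3) Build a request array for the requests routed to the same shard

(4) Serialize that request array

(5) Send the serialized request array to the corresponding node

3. More memory used, and more JVM GC overhead

We mentioned the question of the best bulk size earlier: the usual advice is a few thousand documents and about 10MB per request. Now suppose 100 bulk requests arrive at one node, each 10MB: 100 requests make 1000MB, about 1GB. If the JSON of every request is then copied into a JSONArray object, memory use doubles to 2GB or more, since building the JSONArray may create further data structures on top.

Using more memory may squeeze the memory available to other requests, such as the all-important search and analytics requests, and their performance can drop sharply

In addition, higher memory use makes the JVM collect garbage more often, with more garbage objects to reclaim and more time spent each time, so the Elasticsearch JVM spends more time with its worker threads paused

4. The current unusual format

(1) No conversion to a JSON object and no second copy of the data in memory; the JSON is simply split on newlines

(2) For each pair of JSON lines, read the meta line and route the document

(3) Send the corresponding JSON straight to the node

5. The biggest advantage is that the JSON array never has to be parsed into a JSONArray object, which would create a large copy of the data and waste memory; performance is preserved as far as possible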
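
The line-oriented handling described in point 4 can be sketched in Python: split on newlines, parse only the small metadata line, and pass the data line along as raw text. Several things here are assumptions made for illustration: every action carries an explicit _id, CRC32 stands in for the real routing hash, and the function name is invented.

```python
import json
import zlib
from collections import defaultdict


def group_bulk_by_shard(payload, number_of_primary_shards):
    """Group the raw lines of a bulk body by target shard.

    Only the metadata line of each operation is parsed; the data line is
    forwarded untouched, so the documents are never copied into objects.
    """
    groups = defaultdict(list)
    lines = iter(line for line in payload.split("\n") if line)
    for meta_line in lines:
        (action, meta), = json.loads(meta_line).items()
        shard = zlib.crc32(str(meta["_id"]).encode("utf-8")) % number_of_primary_shards
        groups[shard].append(meta_line)
        if action != "delete":  # delete has no data line
            groups[shard].append(next(lines))
    return dict(groups)
```

Each group can then be joined with newlines again and sent to the node that holds the shard, without ever building one large array for the whole request.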

This article is reposted from 叫我北北's 51CTO blog; original link: http://blog.51cto.com/qinbin/2050285
