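As I mentioned last time, I wanted to set up a whole Solr cluster, that is, SolrCloud; "cloud" seems to be the fashionable thing these days. Most of what you find online covers SolrCloud on Solr 4, usually deployed in Tomcat with a lot of file copying, and Solr 5 write-ups are rare. So let me be the first to try it: Solr 5, Solr 5, Solr 5 (good things are worth saying three times)!

As the Solr documentation puts it: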

Apache Solr includes the ability to set up a cluster of Solr servers that combines fault tolerance and high availability. Called SolrCloud, these capabilities provide distributed indexing and search capabilities, supporting the following features:

Central configuration for the entire cluster
Automatic load balancing and fail-over for queries
ZooKeeper integration for cluster coordination and configuration.

These features are essential for a cluster. For more detail, see the Solr documentation: https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud

1. Multiple nodes on a single machine

$ ./solr -e cloud

Then answer the interactive prompts (this uses the embedded ZooKeeper).

For example: nodes = 2, port for node1 = 8983, port for node2 = 7574

new collection: testcoll

shards: 3, replicas: 3

configuration: basic_configs

Check the result with:

$ cd tools/solr-5.3.0/bin/

$ ./solr status

Found 2 Solr nodes:

Solr process 2185 running on port 7574

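Or open a browser at http://IP:8983/solr/#/~cloud. This view is worth a careful look, since it helps with the core concepts (node, collection, shard, replica). I won't paste the screenshots here.

These commands also work fine on Windows, but a cluster created this way is only a simulated one: both nodes run on the same machine.

2. Fully distributed multi-node SolrCloud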

Node Name	IP	HostName
Node1	192.168.182.128	vm11
Node2	192.168.182.129	vm22
Node3	192.168.182.130	vm33

Install ZooKeeper and Solr on all three nodes, using the IPs and host names above.

2.1 Running ZooKeeper (not the one embedded in Solr)

Download and unpack ZooKeeper. I used zookeeper-3.4.6.tar.gz, installed in /home/hadoop/tools/zookeeper-3.4.6

cd /home/hadoop/tools/zookeeper-3.4.6

mkdir data

echo "1" > data/myid (this ID is referenced by the config file below)

cd conf

vi zoo.cfg (with the following contents)

dataDir=/home/hadoop/tools/zookeeper-3.4.6/data

clientPort=2181

initLimit=5

syncLimit=2

server.1=vm11:2888:3888

server.2=vm22:2888:3888

server.3=vm33:2888:3888

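dataDir The directory where ZooKeeper writes its in-memory data as snapshot files. The myid file, whose content identifies this server, also lives here. It is best to put this somewhere other than the installation directory.

clientPort The port clients connect to, i.e. ZooKeeper's service port. Usually 2181.

initLimit How long the leader gives a follower to connect and finish its initial sync, measured in ticks. The sample config (zoo_sample.cfg) uses 10, i.e. 10 * tickTime.

syncLimit If the leader sends a heartbeat and gets no response from a follower within syncLimit ticks, it treats that follower as offline. The sample config uses 5, i.e. 5 * tickTime.

tickTime ZooKeeper's basic time unit, in milliseconds. All other time settings are configured as integer multiples of it.

server.X The format is server.X=hostname:n:n, where X must match the number in that machine's myid file and hostname is the machine's host name or IP. The first port carries follower-to-leader (transaction) traffic and the second is used for leader election. 2888:3888 is the conventional choice.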
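
To make the tick-based limits concrete, here is a small sketch that converts them to milliseconds. The zoo.cfg above does not set tickTime, so the 2000 ms used here is an assumption (it is the value in the zoo_sample.cfg that ships with ZooKeeper), and zk_limit_ms is a hypothetical helper name, not a ZooKeeper tool.

```shell
# Convert a ZooKeeper limit expressed in ticks into milliseconds.
# usage: zk_limit_ms <tickTime in ms> <limit in ticks>
zk_limit_ms() {
    echo $(( $1 * $2 ))
}

tick_time=2000   # assumption: not set in the zoo.cfg above; value taken from zoo_sample.cfg

# initLimit=5 and syncLimit=2 are the values from the zoo.cfg in this post.
echo "initLimit: $(zk_limit_ms "$tick_time" 5) ms"
echo "syncLimit: $(zk_limit_ms "$tick_time" 2) ms"
```

Under that assumption, initLimit=5 gives followers 10 seconds to connect and sync, and syncLimit=2 allows 4 seconds before a follower is considered offline.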

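With the configuration in place, ZooKeeper can be started.

./bin/zkServer.sh start

Repeat this on the other nodes. The only thing to watch is the myid file: set it to 2 and 3 on the other two nodes before starting.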
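
Since each myid has to match that machine's server.N line, one way to avoid typos is to derive it from zoo.cfg. This is only a minimal sketch: it assumes the host name you pass in is spelled exactly as in the server.N lines, and myid_for_host is a hypothetical helper, not part of ZooKeeper.

```shell
# Print the N of the "server.N=<host>:..." line that matches the given host.
# usage: myid_for_host <host> <path to zoo.cfg>
myid_for_host() {
    sed -n "s/^server\.\([0-9][0-9]*\)=$1:.*/\1/p" "$2"
}

# Demo against a scratch copy of the server list from this post.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
server.1=vm11:2888:3888
server.2=vm22:2888:3888
server.3=vm33:2888:3888
EOF
myid_for_host vm22 "$cfg"
rm -f "$cfg"
```

On a real node this could be used as myid_for_host "$(hostname)" conf/zoo.cfg > data/myid, provided hostname returns the same name that appears in zoo.cfg.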

That gets the ZooKeeper ensemble up and running.

$ ./zkServer.sh status

JMX enabled by default

Using config: /home/hadoop/tools/zookeeper-3.4.6/bin/../conf/zoo.cfg

Mode: leader
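
In a healthy three-node ensemble you would expect one node to report leader and the other two follower. Here is a small sketch for pulling just the mode out of the status output so it can be compared across nodes. zk_mode is a hypothetical helper, and the sample input is the output shown above.

```shell
# Print only the value of the "Mode:" line from zkServer.sh status output.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Demo using the status output from this post as input.
zk_mode <<'EOF'
JMX enabled by default
Using config: /home/hadoop/tools/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
EOF
```

On each node it could be fed real output with ./zkServer.sh status 2>&1 | zk_mode.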

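2.2 Installing and running Solr

Download and unpack Solr. I used solr-5.3.0. A major feature of Solr 5 is that it runs as a standalone application and no longer needs to be deployed into a separate container, which simplifies installation and configuration. I unpacked it to /home/hadoop/tools/solr-5.3.0. The exciting moment is almost here.

As we know, ZooKeeper exists to handle things like unified naming, state synchronization, cluster management, and managing the configuration of distributed applications.

1) First upload the configuration files into ZooKeeper, so that Solr reads its configuration from there and uses it to manage the cluster. This really is a nice arrangement.

Under the Solr directory there is a server/scripts/cloud-scripts directory containing a script for this (ZooKeeper has its own scripts for this too; the Solr one is a wrapper).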

./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -confdir ./server/solr/configsets/basic_configs/conf -confname myconf -z vm11:2181,vm22:2181,vm33:2181

The meaning of these parameters is fairly self-explanatory.

Link the collection to the config:

./server/scripts/cloud-scripts/zkcli.sh -cmd linkconfig -collection mycol -confname myconf -z vm11:2181,vm22:2181,vm33:2181

You can check whether this worked using the ZooKeeper CLI:

cd /home/hadoop/tools/zookeeper-3.4.6/bin

./zkCli.sh -server vm22:2181

[zk: vm22:2181(CONNECTED) 2] ls /configs/myconf

[_rest_managed.json, currency.xml, solrconfig.xml, protwords.txt, stopwords.txt, synonyms.txt, lang, schema.xml]

[zk: vm22:2181(CONNECTED) 3] ls /collections/mycol

[state.json, leader_elect, leaders]

2) Now start Solr on each node.

./bin/solr start -cloud -p 8983 -s "/home/hadoop/tools/solr-5.3.0/server/solr" -z vm11:2181,vm22:2181,vm33:2181

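-cloud runs Solr in cloud mode; -p sets the port.

-s points to the directory containing solr.xml, which is what is usually called solr.solr.home (the official Solr documentation uses that name too). The name really is odd and hard to grasp at first. Put it this way: think of solr.home as the Solr installation directory, like the ubiquitous JAVA_HOME or HADOOP_HOME, whereas solr.solr.home is the runtime configuration directory of each Solr instance, including its cores and so on. You can also copy the few files in it elsewhere and point -s at that directory instead.

-z is the ZooKeeper connection string; separate multiple hosts with commas.

Run this command on all three nodes. Once they are up, the cloud is running.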
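
The -z value is just host:port pairs joined with commas, and it should be the same on every node. Here is a sketch that builds it once from the host list and prints the start command rather than running it. zk_connect_string is a hypothetical helper, and the hosts and paths are the ones used in this post.

```shell
# Join hosts into a ZooKeeper connection string: host1:port,host2:port,...
# usage: zk_connect_string <port> <host>...
zk_connect_string() (
    port=$1
    shift
    out=""
    for host in "$@"; do
        # ${out:+$out,} adds the comma separator only after the first host
        out="${out:+$out,}$host:$port"
    done
    echo "$out"
)

ZK_HOSTS=$(zk_connect_string 2181 vm11 vm22 vm33)

# Print (do not execute) the start command to run on each node.
echo "./bin/solr start -cloud -p 8983 -s /home/hadoop/tools/solr-5.3.0/server/solr -z $ZK_HOSTS"
```

Keeping the host list in one place makes it harder for the three nodes to end up with slightly different connection strings.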

3) Now create the collection and its shards with curl:

curl 'http://vm11:8983/solr/admin/collections?action=CREATE&name=mycol&numShards=3&replicationFactor=3&maxShardsPerNode=3&collection.configName=myconf'

myconf is the configuration we uploaded to ZooKeeper earlier. The other parameters are self-explanatory.
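
A quick sanity check on those numbers: numShards * replicationFactor is the total number of cores Solr has to place, and maxShardsPerNode caps how many may land on a single node. Here is a sketch of that arithmetic, assuming all three nodes are live. min_shards_per_node is a hypothetical helper, not a Solr command.

```shell
# Smallest maxShardsPerNode that lets the collection fit on the given nodes.
# usage: min_shards_per_node <numShards> <replicationFactor> <liveNodes>
min_shards_per_node() {
    total=$(( $1 * $2 ))
    # integer ceiling of total / liveNodes
    echo $(( (total + $3 - 1) / $3 ))
}

# Values from the CREATE call in this post: 3 shards, 3 replicas each, 3 nodes.
echo "total cores: $(( 3 * 3 ))"
echo "minimum maxShardsPerNode on 3 nodes: $(min_shards_per_node 3 3 3)"
```

With 3 shards of 3 replicas on 3 nodes the result is 3, which matches the maxShardsPerNode=3 in the request. Solr's default for this parameter is 1, which would not leave room for 9 cores on 3 nodes.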

Running it on one node is enough. Then open a browser at http://192.168.182.128:8983

I highlighted a few spots, which should be easy to pick out. Now look at solr.solr.home again. Node1, for example, holds three shards, with one replica of each.

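That makes the structure fairly clear and brings out many of the concepts. Below are a few diagrams borrowed from the web (sources not credited; if this is a problem, let me know and I will remove them) that show the relationships between the concepts more clearly.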

Physical-to-logical mapping:

Appendix: Solr Command

Starting with -noprompt

You can also get SolrCloud started with all the defaults instead of the interactive session using the following command:

$ bin/solr -e cloud -noprompt

Restarting Nodes

You can restart your SolrCloud nodes using the bin/solr script. For instance, to restart node1 running on port 8983 (with an embedded ZooKeeper server), you would do:

$ bin/solr restart -c -p 8983 -s example/cloud/node1/solr

To restart node2 running on port 7574, you can do:

$ bin/solr restart -c -p 7574 -z localhost:9983 -s example/cloud/node2/solr

Notice that you need to specify the ZooKeeper address (-z localhost:9983) when starting node2 so that it can join the cluster with node1.

Adding a node to a cluster

Adding a node to an existing cluster is a bit advanced and involves a little more understanding of Solr. Once you start up a SolrCloud cluster using the startup scripts, you can add a new node to it by:

$ mkdir <solr.home for new solr node>

$ cp <existing solr.xml path> <new solr.home>

$ bin/solr start -cloud -s solr.home/solr -p <port num> -z <zk hosts string>

Notice that the above requires you to create a Solr home directory. You either need to copy solr.xml to the solr_home directory, or keep it centrally in ZooKeeper at /solr.xml.

Example (with directory structure) that adds a node to an example started with "bin/solr -e cloud":

$ mkdir -p example/cloud/node3/solr

$ cp server/solr/solr.xml example/cloud/node3/solr

$ bin/solr start -cloud -s example/cloud/node3/solr -p 8987 -z localhost:9983

The previous command will start another Solr node on port 8987 with Solr home set to example/cloud/node3/solr. The new node will write its log files to example/cloud/node3/logs.

Once you're comfortable with how the SolrCloud example works, we recommend using the process described in Taking Solr to Production for setting up SolrCloud nodes in production.
