
druid.io local cluster setup / scaling out to multiple machines

druid.io is a fairly heavyweight database query system, split into 5 types of nodes.

The database itself will not be introduced here; if you have questions, see the white paper:

http://pan.baidu.com/s/1eSFlIJS  

Single-machine cluster setup

First, the general cluster setup, based on version 0.9.1.1.

Download: http://pan.baidu.com/s/1hrJBjlq

Edit common.runtime.properties under conf/druid/_common, using the following configuration as a reference:

  #
  # Licensed to Metamarkets Group Inc. (Metamarkets) under one
  # or more contributor license agreements. See the NOTICE file
  # distributed with this work for additional information
  # regarding copyright ownership. Metamarkets licenses this file
  # to you under the Apache License, Version 2.0 (the
  # "License"); you may not use this file except in compliance
  # with the License. You may obtain a copy of the License at
  # http://www.apache.org/licenses/LICENSE-2.0
  # Unless required by applicable law or agreed to in writing,
  # software distributed under the License is distributed on an
  # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  # KIND, either express or implied. See the License for the
  # specific language governing permissions and limitations
  # under the License.

  # Extensions
  # This is not the full list of Druid extensions, but common ones that people often use. You may need to change this list
  # based on your particular setup.
  #druid.extensions.loadList=["druid-kafka-eight", "druid-s3-extensions", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "mysql-metadata-storage"]
  druid.extensions.loadList=["mysql-metadata-storage"]

  # If you have a different version of Hadoop, place your Hadoop client jar files in your hadoop-dependencies directory
  # and uncomment the line below to point to your directory.
  #druid.extensions.hadoopDependenciesDir=/my/dir/hadoop-dependencies

  # Logging
  # Log all runtime properties on startup. Disable to avoid logging properties on startup:
  druid.startup.logging.logProperties=true

  # Zookeeper
  druid.zk.service.host=10.202.4.22:2181
  druid.zk.paths.base=/druid

  # Metadata storage
  # For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over):
  #druid.metadata.storage.type=derby
  #druid.metadata.storage.connector.connectURI=jdbc:derby://metadata.store.ip:1527/var/druid/metadata.db;create=true
  #druid.metadata.storage.connector.host=metadata.store.ip
  #druid.metadata.storage.connector.port=1527

  # For MySQL:
  druid.metadata.storage.type=mysql
  druid.metadata.storage.connector.connectURI=jdbc:mysql://10.202.4.22:3306/druid?characterEncoding=UTF-8
  druid.metadata.storage.connector.user=szh
  druid.metadata.storage.connector.password=123456

  # For PostgreSQL (make sure to additionally include the Postgres extension):
  #druid.metadata.storage.type=postgresql
  #druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
  #druid.metadata.storage.connector.user=...
  #druid.metadata.storage.connector.password=...

  # Deep storage
  # For local disk (only viable in a cluster if this is a network mount):
  druid.storage.type=local
  druid.storage.storageDirectory=var/druid/segments

  # For HDFS (make sure to include the HDFS extension and that your Hadoop config files in the cp):
  #druid.storage.type=hdfs
  #druid.storage.storageDirectory=/druid/segments

  # For S3:
  #druid.storage.type=s3
  #druid.storage.bucket=your-bucket
  #druid.storage.baseKey=druid/segments
  #druid.s3.accessKey=...
  #druid.s3.secretKey=...

  # Indexing service logs
  druid.indexer.logs.type=file
  druid.indexer.logs.directory=var/druid/indexing-logs
  #druid.indexer.logs.type=hdfs
  #druid.indexer.logs.directory=/druid/indexing-logs
  #druid.indexer.logs.type=s3
  #druid.indexer.logs.s3Bucket=your-bucket
  #druid.indexer.logs.s3Prefix=druid/indexing-logs

  # Service discovery
  druid.selectors.indexing.serviceName=druid/overlord
  druid.selectors.coordinator.serviceName=druid/coordinator

  # Monitoring
  druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
  druid.emitter=logging
  druid.emitter.logging.logLevel=info

0.9.1.1 does not ship with the mysql extension by default; you need to download it yourself and unpack it under extensions:

Reference article:

http://druid.io/docs/0.9.1.1/operations/including-extensions.html

Extension package download page:

http://druid.io/downloads.html
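
For example, a quick sketch of fetching and unpacking the extension. The artifact URL below follows the 0.9.1.1 release naming convention and is an assumption; verify it against the download page above:

  # assumed artifact URL for the 0.9.1.1 mysql extension -- verify before use
  curl -LO http://static.druid.io/artifacts/releases/mysql-metadata-storage-0.9.1.1.tar.gz
  # unpack into the extensions directory of the Druid install
  tar -xzf mysql-metadata-storage-0.9.1.1.tar.gz -C extensions/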

Next, edit the startup configuration of each of the 5 node types:

/conf/druid/${serviceName}/runtime.properties

broker node:

/conf/druid/broker/runtime.properties

  druid.host=10.202.4.22:9102
  druid.service=druid/broker
  druid.port=9102

  # HTTP server threads
  druid.broker.http.numConnections=5
  druid.server.http.numThreads=25

  # Processing threads and buffers
  druid.processing.buffer.sizeBytes=32768
  druid.processing.numThreads=2

  # Query cache
  druid.broker.cache.useCache=true
  druid.broker.cache.populateCache=true
  druid.cache.type=local
  druid.cache.sizeInBytes=2000000000

coordinator node:

/conf/druid/coordinator/runtime.properties

  druid.host=10.202.4.22:8082
  druid.service=druid/coordinator
  druid.port=8082

historical node:

/conf/druid/historical/runtime.properties

  druid.service=druid/historical
  druid.host=10.202.4.22:9002
  druid.port=9002

  druid.processing.buffer.sizeBytes=6870912
  druid.processing.numThreads=7

  druid.historical.cache.useCache=false
  druid.historical.cache.populateCache=false

  # Segment storage
  druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":13000000}]
  druid.server.maxSize=13000000

middleManager node:

/conf/druid/middleManager/runtime.properties

  druid.host=10.202.4.22:8091
  druid.service=druid/middleManager
  druid.port=8091

  # Number of tasks per middleManager
  druid.worker.capacity=3

  # Task launch parameters
  druid.indexer.runner.javaOpts=-server -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
  druid.indexer.task.baseTaskDir=var/druid/task
  druid.processing.buffer.sizeBytes=65536

  # Hadoop indexing
  druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
  druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.3.0"]

overlord node:

/conf/druid/overlord/runtime.properties

  druid.service=druid/overlord
  druid.host=10.202.4.22:9100
  druid.port=9100

  #druid.indexer.queue.startDelay=PT30S
  druid.indexer.runner.type=remote
  druid.indexer.storage.type=metadata

To start the cluster, you can use this script I wrote:

  #!/bin/bash
  # Start a single Druid node. Usage: scriptName serviceName [-f]
  #java `cat conf/druid/broker/jvm.config | xargs` -cp conf/druid/_common:conf/druid/broker:lib/* io.druid.cli.Main server broker

  function help(){
      echo "Arguments:"
      echo "  serviceName [-f]"
      echo "arg 1: serviceName: the service to start, one of:"
      echo "  1: broker"
      echo "  2: coordinator"
      echo "  3: historical"
      echo "  4: middleManager"
      echo "  5: overlord"
      echo "arg 2: [-f]: run in the foreground"
      echo "  -f: foreground; when omitted, the service starts in the background"
  }

  function startService(){
      echo $service
      if [[ $2 == "-f" ]]; then
          echo "starting in the foreground"
          java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath conf/druid/_common:conf/druid/$service:lib/* io.druid.cli.Main server $service
      else
          echo "starting in the background"
          nohup java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath conf/druid/_common:conf/druid/$service:lib/* io.druid.cli.Main server $service &
      fi
  }

  function tips(){
      red=`tput setaf 1`
      reset=`tput sgr0`
      echo "${red}Not correct arguments${reset}"
      echo "please use --help or -h for help"
      exit 1
  }

  if [[ $1 == "--help" || $1 == "-h" ]]; then
      help
      exit
  fi

  # validate the service name, then dispatch
  service=$1
  case $service in
      "broker") ;;
      "coordinator") ;;
      "historical") ;;
      "middleManager") ;;
      "overlord") ;;
      *) tips ;;
  esac

  if [[ $2 == "-f" || $2 == "" ]]; then
      startService $1 $2
  else
      tips
  fi

Place the script above in the druid root directory.
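
For example (assuming you saved it as start_service.sh; the file name is up to you):

  ./start_service.sh broker          # start the broker in the background
  ./start_service.sh overlord -f     # start the overlord in the foreground
  ./start_service.sh -h              # show the help text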

The startup output is shown in the figure:
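
For a quick sanity check that all five JVMs actually came up, one option (a sketch; jps ships with the JDK) is:

  # every running Druid node appears as io.druid.cli.Main with its service name
  jps -lm | grep io.druid.cli.Main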

Multi-machine cluster setup:

The above is a cluster on a single machine. To extend it to multiple machines, you only need to edit runtime.properties under conf/druid/(broker | coordinator | historical | middleManager | overlord) on each machine and change druid.host=10.202.4.22:9102 to that machine's own IP address.
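
For instance, on a second machine with the hypothetical IP 10.202.4.23, all five files can be rewritten in one pass:

  # swap the sample IP for this machine's IP in every node config;
  # _common/common.runtime.properties (zookeeper, mysql) is not matched by this glob
  sed -i 's/10\.202\.4\.22/10.202.4.23/g' conf/druid/*/runtime.properties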

=========================================================

Finally, here is the local cluster I set up successfully (it includes the cluster start script I wrote):

http://pan.baidu.com/s/1bJjFzg

It is based on version 0.9.1.1 and uses mysql for metadata storage (the extension package is already downloaded).

After downloading and unpacking the druid.tgz local cluster I built:

1. Edit each of the configuration files:

conf/druid/broker/runtime.properties 

conf/druid/coordinator/runtime.properties 

conf/druid/historical/runtime.properties 

conf/druid/middleManager/runtime.properties 

conf/druid/overlord/runtime.properties 

changing druid.host=10.202.4.22:9102 to your own IP address in each.

2. In conf/druid/_common/common.runtime.properties, replace the mysql host, port, user name, and password, the zookeeper address, and so on with your own.

Tips: mysql does not come with a druid database; you must create it yourself first (install mysql, then create the druid database and druid user).
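
A minimal sketch of that setup, run as the mysql root user (the user name and password must match the values in common.runtime.properties, szh/123456 in the config above):

  mysql -u root -p -e "
  CREATE DATABASE druid DEFAULT CHARACTER SET utf8;
  CREATE USER 'szh'@'%' IDENTIFIED BY '123456';
  GRANT ALL PRIVILEGES ON druid.* TO 'szh'@'%';
  FLUSH PRIVILEGES;"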

3. Start the 5 node services with cluster_start_service.sh in the root directory.

At this point, if you open mysql you will find that druid has automatically created several tables.
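
You can confirm this from the shell; the exact table names (druid_segments, druid_rules, druid_tasks, ...) may vary slightly by version:

  # list the tables Druid created in its metadata store
  mysql -u szh -p123456 -e 'SHOW TABLES;' druid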

4. Run the tests:

(1) data ingestion

(2) data query

Switch to the test directory; it contains two folders:

(1) Data ingestion:

Switch into the test_load directory.

There, submit_csv_task.sh and submit_json_task.sh are test scripts that submit csv data and json data respectively.

Before submitting, edit env.sh and set overlord_ip to the address and port of your own overlord node.
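
A sketch of what env.sh would then contain (the variable name is taken from the text; the value matches the overlord config above):

  # env.sh: overlord address used by the submit scripts
  overlord_ip=10.202.4.22:9100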

Now try running the submit_json_task.sh script, then check the status of the ingestion task in the web console, as shown in the figure:
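
For reference, a submit script of this kind usually boils down to a single POST of the ingestion spec to the overlord's task API (a sketch; the spec file name here is illustrative):

  source env.sh
  # submit the JSON ingestion spec to the overlord
  curl -X POST -H 'Content-Type: application/json' \
    -d @json_index_task.json \
    "http://${overlord_ip}/druid/indexer/v1/task"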

The address to enter is the overlord node's address plus its port (10.202.4.22:9100 in this setup):

Wait for the task to complete successfully:

Then switch to the query test directory and run the query script:
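
If you prefer to issue a query by hand, here is a minimal sketch of a timeseries query posted straight to the broker (the datasource name and interval are placeholders; the broker address comes from the config above):

  curl -X POST -H 'Content-Type: application/json' \
    "http://10.202.4.22:9102/druid/v2/?pretty" -d '{
      "queryType": "timeseries",
      "dataSource": "your_datasource",
      "granularity": "day",
      "aggregations": [{"type": "count", "name": "rows"}],
      "intervals": ["2016-01-01/2017-01-01"]
    }'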
