
druid.io local cluster setup / scaling out to multiple machines

druid.io is a fairly heavyweight database query system, split into 5 types of nodes.

The database itself will not be introduced here; if you have questions, see the white paper:

http://pan.baidu.com/s/1eSFlIJS  

Single-machine cluster setup

First, the general cluster setup, based on version 0.9.1.1.

Download: http://pan.baidu.com/s/1hrJBjlq

Edit common.runtime.properties under conf/druid/_common, using the following configuration as a reference:

  #
  # Licensed to Metamarkets Group Inc. (Metamarkets) under one
  # or more contributor license agreements. See the NOTICE file
  # distributed with this work for additional information
  # regarding copyright ownership. Metamarkets licenses this file
  # to you under the Apache License, Version 2.0 (the
  # "License"); you may not use this file except in compliance
  # with the License. You may obtain a copy of the License at
  # http://www.apache.org/licenses/LICENSE-2.0
  # Unless required by applicable law or agreed to in writing,
  # software distributed under the License is distributed on an
  # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  # KIND, either express or implied. See the License for the
  # specific language governing permissions and limitations
  # under the License.

  # Extensions
  # This is not the full list of Druid extensions, but common ones that people often use. You may need to change this list
  # based on your particular setup.
  #druid.extensions.loadList=["druid-kafka-eight", "druid-s3-extensions", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "mysql-metadata-storage"]
  druid.extensions.loadList=["mysql-metadata-storage"]

  # If you have a different version of Hadoop, place your Hadoop client jar files in your hadoop-dependencies directory
  # and uncomment the line below to point to your directory.
  #druid.extensions.hadoopDependenciesDir=/my/dir/hadoop-dependencies

  # Logging
  # Log all runtime properties on startup. Disable to avoid logging properties on startup:
  druid.startup.logging.logProperties=true

  # Zookeeper
  druid.zk.service.host=10.202.4.22:2181
  druid.zk.paths.base=/druid

  # Metadata storage
  # For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over):
  #druid.metadata.storage.type=derby
  #druid.metadata.storage.connector.connectURI=jdbc:derby://metadata.store.ip:1527/var/druid/metadata.db;create=true
  #druid.metadata.storage.connector.host=metadata.store.ip
  #druid.metadata.storage.connector.port=1527

  # For MySQL:
  druid.metadata.storage.type=mysql
  druid.metadata.storage.connector.connectURI=jdbc:mysql://10.202.4.22:3306/druid?characterEncoding=UTF-8
  druid.metadata.storage.connector.user=szh
  druid.metadata.storage.connector.password=123456

  # For PostgreSQL (make sure to additionally include the Postgres extension):
  #druid.metadata.storage.type=postgresql
  #druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
  #druid.metadata.storage.connector.user=...
  #druid.metadata.storage.connector.password=...

  # Deep storage
  # For local disk (only viable in a cluster if this is a network mount):
  druid.storage.type=local
  druid.storage.storageDirectory=var/druid/segments

  # For HDFS (make sure to include the HDFS extension and that your Hadoop config files in the cp):
  #druid.storage.type=hdfs
  #druid.storage.storageDirectory=/druid/segments

  # For S3:
  #druid.storage.type=s3
  #druid.storage.bucket=your-bucket
  #druid.storage.baseKey=druid/segments
  #druid.s3.accessKey=...
  #druid.s3.secretKey=...

  # Indexing service logs
  druid.indexer.logs.type=file
  druid.indexer.logs.directory=var/druid/indexing-logs
  #druid.indexer.logs.type=hdfs
  #druid.indexer.logs.directory=/druid/indexing-logs
  #druid.indexer.logs.type=s3
  #druid.indexer.logs.s3Bucket=your-bucket
  #druid.indexer.logs.s3Prefix=druid/indexing-logs

  # Service discovery
  druid.selectors.indexing.serviceName=druid/overlord
  druid.selectors.coordinator.serviceName=druid/coordinator

  # Monitoring
  druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
  druid.emitter=logging
  druid.emitter.logging.logLevel=info

0.9.1.1 does not ship with the mysql extension by default; you need to download it yourself and unpack it under extensions:

Reference article:

http://druid.io/docs/0.9.1.1/operations/including-extensions.html

Extension package download page:

http://druid.io/downloads.html
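
For example, a quick sketch of fetching and unpacking the extension. The artifact URL below follows the 0.9.1.1 release naming convention and is an assumption; verify it against the download page above:

  # assumed artifact URL for the 0.9.1.1 mysql extension -- verify before use
  curl -LO http://static.druid.io/artifacts/releases/mysql-metadata-storage-0.9.1.1.tar.gz
  # unpack into the extensions directory of the Druid install
  tar -xzf mysql-metadata-storage-0.9.1.1.tar.gz -C extensions/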

Next, edit the startup configuration of each of the 5 node types:

/conf/druid/${serviceName}/runtime.properties

broker node:

/conf/druid/broker/runtime.properties

  druid.host=10.202.4.22:9102
  druid.service=druid/broker
  druid.port=9102

  # HTTP server threads
  druid.broker.http.numConnections=5
  druid.server.http.numThreads=25

  # Processing threads and buffers
  druid.processing.buffer.sizeBytes=32768
  druid.processing.numThreads=2

  # Query cache
  druid.broker.cache.useCache=true
  druid.broker.cache.populateCache=true
  druid.cache.type=local
  druid.cache.sizeInBytes=2000000000

coordinator node:

/conf/druid/coordinator/runtime.properties

  druid.host=10.202.4.22:8082
  druid.service=druid/coordinator
  druid.port=8082

historical node:

/conf/druid/historical/runtime.properties

  druid.service=druid/historical
  druid.host=10.202.4.22:9002
  druid.port=9002

  druid.processing.buffer.sizeBytes=6870912
  druid.processing.numThreads=7

  druid.historical.cache.useCache=false
  druid.historical.cache.populateCache=false

  # Segment storage
  druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":13000000}]
  druid.server.maxSize=13000000

middleManager node:

/conf/druid/middleManager/runtime.properties

  druid.host=10.202.4.22:8091
  druid.service=druid/middleManager
  druid.port=8091

  # Number of tasks per middleManager
  druid.worker.capacity=3

  # Task launch parameters
  druid.indexer.runner.javaOpts=-server -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
  druid.indexer.task.baseTaskDir=var/druid/task
  druid.processing.buffer.sizeBytes=65536

  # Hadoop indexing
  druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
  druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.3.0"]

overlord node:

/conf/druid/overlord/runtime.properties

  druid.service=druid/overlord
  druid.host=10.202.4.22:9100
  druid.port=9100

  #druid.indexer.queue.startDelay=PT30S
  druid.indexer.runner.type=remote
  druid.indexer.storage.type=metadata

To start the cluster, you can use this script I wrote:

  #!/bin/bash
  # Start a single Druid node. Usage: scriptName serviceName [-f]
  #java `cat conf/druid/broker/jvm.config | xargs` -cp conf/druid/_common:conf/druid/broker:lib/* io.druid.cli.Main server broker

  function help(){
      echo "Arguments:"
      echo "  serviceName [-f]"
      echo "arg 1: serviceName: the service to start, one of:"
      echo "  1: broker"
      echo "  2: coordinator"
      echo "  3: historical"
      echo "  4: middleManager"
      echo "  5: overlord"
      echo "arg 2: [-f]: run in the foreground"
      echo "  -f: foreground; when omitted, the service starts in the background"
  }

  function startService(){
      echo $service
      if [[ $2 == "-f" ]]; then
          echo "starting in the foreground"
          java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath conf/druid/_common:conf/druid/$service:lib/* io.druid.cli.Main server $service
      else
          echo "starting in the background"
          nohup java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath conf/druid/_common:conf/druid/$service:lib/* io.druid.cli.Main server $service &
      fi
  }

  function tips(){
      red=`tput setaf 1`
      reset=`tput sgr0`
      echo "${red}Not correct arguments${reset}"
      echo "please use --help or -h for help"
      exit 1
  }

  if [[ $1 == "--help" || $1 == "-h" ]]; then
      help
      exit
  fi

  # validate the service name, then dispatch
  service=$1
  case $service in
      "broker") ;;
      "coordinator") ;;
      "historical") ;;
      "middleManager") ;;
      "overlord") ;;
      *) tips ;;
  esac

  if [[ $2 == "-f" || $2 == "" ]]; then
      startService $1 $2
  else
      tips
  fi

Place the script above in the druid root directory.
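
For example (assuming you saved it as start_service.sh; the file name is up to you):

  ./start_service.sh broker          # start the broker in the background
  ./start_service.sh overlord -f     # start the overlord in the foreground
  ./start_service.sh -h              # show the help text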

The startup output is shown in the figure:
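
For a quick sanity check that all five JVMs actually came up, one option (a sketch; jps ships with the JDK) is:

  # every running Druid node appears as io.druid.cli.Main with its service name
  jps -lm | grep io.druid.cli.Main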

Multi-machine cluster setup:

The above is a cluster on a single machine. To extend it to multiple machines, you only need to edit runtime.properties under conf/druid/(broker | coordinator | historical | middleManager | overlord) on each machine and change druid.host=10.202.4.22:9102 to that machine's own IP address.
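
For instance, on a second machine with the hypothetical IP 10.202.4.23, all five files can be rewritten in one pass:

  # swap the sample IP for this machine's IP in every node config;
  # _common/common.runtime.properties (zookeeper, mysql) is not matched by this glob
  sed -i 's/10\.202\.4\.22/10.202.4.23/g' conf/druid/*/runtime.properties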

=========================================================

Finally, here is the local cluster I set up successfully (it includes the cluster start script I wrote):

http://pan.baidu.com/s/1bJjFzg

It is based on version 0.9.1.1 and uses mysql for metadata storage (the extension package is already downloaded).

After downloading and unpacking the druid.tgz local cluster I built:

1. Edit each of the configuration files:

conf/druid/broker/runtime.properties 

conf/druid/coordinator/runtime.properties 

conf/druid/historical/runtime.properties 

conf/druid/middleManager/runtime.properties 

conf/druid/overlord/runtime.properties 

changing druid.host=10.202.4.22:9102 to your own IP address in each.

2. In conf/druid/_common/common.runtime.properties, replace the mysql host, port, user name, and password, the zookeeper address, and so on with your own.

Tips: mysql does not come with a druid database; you must create it yourself first (install mysql, then create the druid database and druid user).
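
A minimal sketch of that setup, run as the mysql root user (the user name and password must match the values in common.runtime.properties, szh/123456 in the config above):

  mysql -u root -p -e "
  CREATE DATABASE druid DEFAULT CHARACTER SET utf8;
  CREATE USER 'szh'@'%' IDENTIFIED BY '123456';
  GRANT ALL PRIVILEGES ON druid.* TO 'szh'@'%';
  FLUSH PRIVILEGES;"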

3. Start the 5 node services with cluster_start_service.sh in the root directory.

At this point, if you open mysql you will find that druid has automatically created several tables.
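
You can confirm this from the shell; the exact table names (druid_segments, druid_rules, druid_tasks, ...) may vary slightly by version:

  # list the tables Druid created in its metadata store
  mysql -u szh -p123456 -e 'SHOW TABLES;' druid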

4. Run the tests:

(1) data ingestion

(2) data query

Switch to the test directory; it contains two folders:

(1) Data ingestion:

Switch into the test_load directory.

There, submit_csv_task.sh and submit_json_task.sh are test scripts that submit csv data and json data respectively.

Before submitting, edit env.sh and set overlord_ip to the address and port of your own overlord node.
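
A sketch of what env.sh would then contain (the variable name is taken from the text; the value matches the overlord config above):

  # env.sh: overlord address used by the submit scripts
  overlord_ip=10.202.4.22:9100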

Now try running the submit_json_task.sh script, then check the status of the ingestion task in the web console, as shown in the figure:
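
For reference, a submit script of this kind usually boils down to a single POST of the ingestion spec to the overlord's task API (a sketch; the spec file name here is illustrative):

  source env.sh
  # submit the JSON ingestion spec to the overlord
  curl -X POST -H 'Content-Type: application/json' \
    -d @json_index_task.json \
    "http://${overlord_ip}/druid/indexer/v1/task"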

The address to enter is the overlord node's address plus its port (10.202.4.22:9100 in this setup):

Wait for the task to complete successfully:

Then switch to the query test directory and run the query script:
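
If you prefer to issue a query by hand, here is a minimal sketch of a timeseries query posted straight to the broker (the datasource name and interval are placeholders; the broker address comes from the config above):

  curl -X POST -H 'Content-Type: application/json' \
    "http://10.202.4.22:9102/druid/v2/?pretty" -d '{
      "queryType": "timeseries",
      "dataSource": "your_datasource",
      "granularity": "day",
      "aggregations": [{"type": "count", "name": "rows"}],
      "intervals": ["2016-01-01/2017-01-01"]
    }'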
