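druid.io is a fairly heavyweight data query system made up of five node types.
I won't introduce the database itself here; if you have questions, see the white paper:
http://pan.baidu.com/s/1eSFlIJS
Single-machine cluster setup
First, the general cluster setup, based on 0.9.1.1.
Download: http://pan.baidu.com/s/1hrJBjlq
Edit common.runtime.properties under conf/druid/_common, using the following configuration as a reference: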
- #
- # Licensed to Metamarkets Group Inc. (Metamarkets) under one
- # or more contributor license agreements. See the NOTICE file
- # distributed with this work for additional information
- # regarding copyright ownership. Metamarkets licenses this file
- # to you under the Apache License, Version 2.0 (the
- # "License"); you may not use this file except in compliance
- # with the License. You may obtain a copy of the License at
- # http://www.apache.org/licenses/LICENSE-2.0
- # Unless required by applicable law or agreed to in writing,
- # software distributed under the License is distributed on an
- # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- # KIND, either express or implied. See the License for the
- # specific language governing permissions and limitations
- # under the License.
- # Extensions
- # This is not the full list of Druid extensions, but common ones that people often use. You may need to change this list
- # based on your particular setup.
- #druid.extensions.loadList=["druid-kafka-eight", "druid-s3-extensions", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "mysql-metadata-storage"]
- druid.extensions.loadList=["mysql-metadata-storage"]
- # If you have a different version of Hadoop, place your Hadoop client jar files in your hadoop-dependencies directory
- # and uncomment the line below to point to your directory.
- #druid.extensions.hadoopDependenciesDir=/my/dir/hadoop-dependencies
- # Logging
- # Log all runtime properties on startup. Disable to avoid logging properties on startup:
- druid.startup.logging.logProperties=true
- # Zookeeper
- druid.zk.service.host=10.202.4.22:2181
- druid.zk.paths.base=/druid
- # Metadata storage
- # For Derby server on your Druid Coordinator (only viable in a cluster with a single Coordinator, no fail-over):
- #druid.metadata.storage.type=derby
- #druid.metadata.storage.connector.connectURI=jdbc:derby://metadata.store.ip:1527/var/druid/metadata.db;create=true
- #druid.metadata.storage.connector.host=metadata.store.ip
- #druid.metadata.storage.connector.port=1527
- # For MySQL:
- druid.metadata.storage.type=mysql
- druid.metadata.storage.connector.connectURI=jdbc:mysql://10.202.4.22:3306/druid?characterEncoding=UTF-8
- druid.metadata.storage.connector.user=szh
- druid.metadata.storage.connector.password=123456
- # For PostgreSQL (make sure to additionally include the Postgres extension):
- #druid.metadata.storage.type=postgresql
- #druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
- #druid.metadata.storage.connector.user=...
- #druid.metadata.storage.connector.password=...
- # Deep storage
- # For local disk (only viable in a cluster if this is a network mount):
- druid.storage.type=local
- druid.storage.storageDirectory=var/druid/segments
- # For HDFS (make sure to include the HDFS extension and that your Hadoop config files in the cp):
- #druid.storage.type=hdfs
- #druid.storage.storageDirectory=/druid/segments
- # For S3:
- #druid.storage.type=s3
- #druid.storage.bucket=your-bucket
- #druid.storage.baseKey=druid/segments
- #druid.s3.accessKey=...
- #druid.s3.secretKey=...
- # Indexing service logs
- druid.indexer.logs.type=file
- druid.indexer.logs.directory=var/druid/indexing-logs
- #druid.indexer.logs.type=hdfs
- #druid.indexer.logs.directory=/druid/indexing-logs
- #druid.indexer.logs.type=s3
- #druid.indexer.logs.s3Bucket=your-bucket
- #druid.indexer.logs.s3Prefix=druid/indexing-logs
- # Service discovery
- druid.selectors.indexing.serviceName=druid/overlord
- druid.selectors.coordinator.serviceName=druid/coordinator
- # Monitoring
- druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
- druid.emitter=logging
- druid.emitter.logging.logLevel=info
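Version 0.9.1.1 does not include the MySQL extension by default. Download it yourself, unpack it, and place it under extensions:
Reference:
http://druid.io/docs/0.9.1.1/operations/including-extensions.html
Extension package download:
http://druid.io/downloads.html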
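The unpack step can be sketched as a small shell function. This is only an illustration: the tarball name and the install path in the example are assumptions, and the unpacked directory is expected to be named mysql-metadata-storage so that it matches the druid.extensions.loadList entry in the common configuration.

```shell
#!/bin/bash
# Unpack an extension tarball into <druid_home>/extensions.
# Both arguments are placeholders; adjust them to your own paths.
install_extension() {
  local tarball=$1
  local druid_home=$2
  mkdir -p "$druid_home/extensions"
  tar -xzf "$tarball" -C "$druid_home/extensions"
  # List what is now installed so the result can be checked by eye.
  ls "$druid_home/extensions"
}

# Example (hypothetical file name and install path):
# install_extension mysql-metadata-storage-0.9.1.1.tar.gz /opt/druid-0.9.1.1
```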
Edit the startup configuration of each of the five nodes:
/conf/druid/${serviceName}/runtime.properties
Broker node:
/conf/druid/broker/runtime.properties
- druid.host=10.202.4.22:9102
- druid.service=druid/broker
- druid.port=9102
- # HTTP server threads
- druid.broker.http.numConnections=5
- druid.server.http.numThreads=25
- # Processing threads and buffers
- druid.processing.buffer.sizeBytes=32768
- druid.processing.numThreads=2
- # Query cache
- druid.broker.cache.useCache=true
- druid.broker.cache.populateCache=true
- druid.cache.type=local
- druid.cache.sizeInBytes=2000000000
Coordinator node:
/conf/druid/coordinator/runtime.properties
- druid.host=10.202.4.22:8082
- druid.service=druid/coordinator
- druid.port=8082
Historical node:
/conf/druid/historical/runtime.properties
- druid.service=druid/historical
- druid.host=10.202.4.22:9002
- druid.port=9002
- druid.processing.buffer.sizeBytes=6870912
- druid.processing.numThreads=7
- druid.historical.cache.useCache=false
- druid.historical.cache.populateCache=false
- # Segment storage
- druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize"\:13000000}]
- druid.server.maxSize=13000000
MiddleManager node:
/conf/druid/middleManager/runtime.properties
- druid.host=10.202.4.22:8091
- druid.service=druid/middleManager
- druid.port=8091
- # Number of tasks per middleManager
- druid.worker.capacity=3
- # Task launch parameters
- druid.indexer.runner.javaOpts=-server -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
- druid.indexer.task.baseTaskDir=var/druid/task
- druid.processing.buffer.sizeBytes=65536
- # Hadoop indexing
- druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
- druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.3.0"]
Overlord node:
/conf/druid/overlord/runtime.properties
- druid.service=druid/overlord
- druid.host=10.202.4.22:9100
- druid.port=9100
- #druid.indexer.queue.startDelay=PT30S
- druid.indexer.runner.type=remote
- druid.indexer.storage.type=metadata
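In each of the five files above, the port inside druid.host repeats the value of druid.port, so a typo in one of them is easy to miss. The following is a minimal sketch of a consistency check. The function name check_host_port is made up for this example and is not part of Druid.

```shell
#!/bin/bash
# Print "<druid.service> <druid.host> <druid.port>" for one runtime.properties
# file and return non-zero when the port inside druid.host differs from druid.port.
check_host_port() {
  local file=$1
  local host port service
  host=$(sed -n 's/^druid\.host=//p' "$file")
  port=$(sed -n 's/^druid\.port=//p' "$file")
  service=$(sed -n 's/^druid\.service=//p' "$file")
  echo "$service $host $port"
  # ${host##*:} strips everything up to the last colon, leaving the port.
  [ "${host##*:}" = "$port" ]
}

# Example, run from the Druid root directory:
# for s in broker coordinator historical middleManager overlord; do
#   check_host_port "conf/druid/$s/runtime.properties" || echo "mismatch in $s"
# done
```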
To start the cluster, you can use the script I wrote:
- #!/bin/bash
- # Manual equivalent for a single service:
- #java `cat conf/druid/broker/jvm.config | xargs` -cp conf/druid/_common:conf/druid/broker:lib/* io.druid.cli.Main server broker
- # Print usage information.
- function help(){
-   echo "Arguments:"
-   echo "  arg1         arg2"
-   echo "  serviceName  [-f]"
-   echo "arg1: serviceName: name of the service to start"
-   echo "serviceName options:"
-   echo "1: broker"
-   echo "2: coordinator"
-   echo "3: historical"
-   echo "4: middleManager"
-   echo "5: overlord"
-   echo "arg2: [-f]: whether to start in the foreground"
-   echo "-f: start in the foreground; without it the service starts in the background (default)"
- }
- # Start the service named in $service; passing -f as $2 keeps it in the foreground.
- function startService(){
-   echo $service
-   if [[ $2 == "-f" ]]; then
-     echo "Starting in the foreground"
-     java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath "conf/druid/_common:conf/druid/$service:lib/*" io.druid.cli.Main server $service
-   else
-     echo "Starting in the background"
-     nohup java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath "conf/druid/_common:conf/druid/$service:lib/*" io.druid.cli.Main server $service &
-   fi
- }
- # Report invalid arguments.
- function tips(){
-   red=`tput setaf 1`
-   reset=`tput sgr0`
-   echo "${red}Not correct arguments${reset}"
-   echo "please use --help or -h for help"
- }
- if [[ $1 == "--help" || $1 == "-h" ]]; then
-   help
-   exit
- fi
- service=$1
- # Accept only the five known service names.
- case $service in
-   "broker"|"coordinator"|"historical"|"middleManager"|"overlord")
-     ;;
-   *)
-     tips
-     exit 1
-     ;;
- esac
- if [[ $2 == "-f" || $2 == "" ]]; then
-   startService $1 $2
- else
-   tips
-   exit 1
- fi
Place the script above in the Druid root directory.
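The script starts one service per call, so bringing up the whole cluster means calling it five times. Below is a rough sketch of a wrapper that does this in the background. The function name start_all_services is made up here, and the service order simply follows the script's help text rather than any documented requirement.

```shell
#!/bin/bash
# Call the start script once per service, without -f, so each one
# is started in the background.
start_all_services() {
  local script=$1
  local service
  for service in broker coordinator historical middleManager overlord; do
    bash "$script" "$service"
  done
}

# Example, run from the Druid root directory
# (cluster_start_service.sh is the script name used later in this post):
# start_all_services ./cluster_start_service.sh
```

To watch a single service's log output directly, call the script with -f instead, for example with broker as the first argument.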
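The startup result is shown in the figure:
Multi-machine cluster setup:
The above is a single-machine cluster. To extend it to multiple machines,
just edit runtime.properties under conf/druid/(broker | coordinator | historical | middleManager | overlord) and
change druid.host=10.202.4.22:9102 to the IP address of the other machine.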
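Editing five files by hand on every machine is error-prone, so the change can be scripted. This is a minimal sketch, assuming the same Druid directory has been copied to each machine and the function is run there with that machine's own IP. The name set_druid_host_ip is hypothetical.

```shell
#!/bin/bash
# Replace the IP part of druid.host in every service's runtime.properties,
# leaving the port untouched. conf_dir is the conf/druid directory.
set_druid_host_ip() {
  local conf_dir=$1
  local new_ip=$2
  local service file
  for service in broker coordinator historical middleManager overlord; do
    file="$conf_dir/$service/runtime.properties"
    [ -f "$file" ] || continue
    # Write to a temporary file first so the original is only replaced on success.
    sed "s/^druid\.host=[^:]*:/druid.host=$new_ip:/" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
  done
}

# Example (192.168.1.5 is a placeholder address):
# set_druid_host_ip conf/druid 192.168.1.5
```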
=========================================================
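Finally, I'm sharing the local cluster I set up successfully (it includes my own cluster start script):
http://pan.baidu.com/s/1bJjFzg
It is based on version 0.9.1.1 and uses MySQL for metadata storage (the extension package is already downloaded).
After downloading and unpacking my druid.tgz local cluster:
1. Edit each configuration file
conf/druid/broker/runtime.properties
conf/druid/coordinator/runtime.properties
conf/druid/historical/runtime.properties
conf/druid/middleManager/runtime.properties
conf/druid/overlord/runtime.properties
Change druid.host=10.202.4.22:9102 to your own IP address.
2. In conf/druid/_common/common.runtime.properties, replace the MySQL address and port, user name, password, ZooKeeper address, and so on with your own.
Tips: if the druid database does not exist in MySQL, create it yourself first (install MySQL, then create the druid database and the druid user).
3. Use cluster_start_service.sh in the root directory to start the services for the five nodes.
If you now log in to MySQL, you will find that Druid has automatically created several tables.
4. Run the tests
(1) Data loading
(2) Data query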
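The tip in step 2 about creating the database can be sketched as follows. The function only prints SQL and the name druid_db_sql is made up for this example. The user name and password are whatever you put in common.runtime.properties, and the '%' host wildcard is an assumption that may be too permissive for your environment.

```shell
#!/bin/bash
# Print the SQL that creates the druid database and a user with full
# rights on it. Nothing is executed against MySQL by this function.
druid_db_sql() {
  local user=$1
  local password=$2
  cat <<EOF
CREATE DATABASE IF NOT EXISTS druid DEFAULT CHARACTER SET utf8;
CREATE USER '$user'@'%' IDENTIFIED BY '$password';
GRANT ALL PRIVILEGES ON druid.* TO '$user'@'%';
EOF
}

# Example: pipe the statements into the mysql client as an administrative user.
# druid_db_sql druid 'your-password' | mysql -h 10.202.4.22 -P 3306 -u root -p
```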
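Switch to the test directory, which contains two folders:
(1) Data loading:
Switch to the test_load directory.
The scripts submit_csv_task.sh and submit_json_task.sh there submit CSV data and JSON data respectively.
You need to edit env.sh first
and set overlord_ip to the address and port of your own Overlord.
Now try running submit_json_task.sh, then check the status of the load task on the web page, as shown in the figure:
The address to enter is the Overlord node's address plus port:
Wait for the task to finish successfully: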
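The contents of the submit scripts are not shown here, but submitting a task comes down to an HTTP POST of an ingestion spec to the Overlord's task endpoint. The sketch below illustrates that request and is not a copy of submit_json_task.sh. The function names and the spec file name are hypothetical.

```shell
#!/bin/bash
# Build the Overlord task endpoint from "ip:port".
overlord_task_url() {
  echo "http://$1/druid/indexer/v1/task"
}

# POST an ingestion spec file to the Overlord.
submit_task() {
  local overlord=$1
  local spec_file=$2
  curl -X POST -H 'Content-Type: application/json' \
    -d @"$spec_file" "$(overlord_task_url "$overlord")"
}

# Example (the spec file name is a placeholder):
# submit_task 10.202.4.22:9100 my_index_task.json
```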
Switch to the query test directory and run the query script:
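For reference, a query is a JSON document posted to the Broker. The sketch below sends a timeBoundary query, one of the simplest query types, to check that the loaded data is visible. It is an illustration rather than the query script from the download, and the datasource name in the example is a placeholder.

```shell
#!/bin/bash
# Print a minimal timeBoundary query for the given datasource.
time_boundary_query() {
  printf '{"queryType":"timeBoundary","dataSource":"%s"}\n' "$1"
}

# POST the query to the Broker, given as "ip:port".
query_broker() {
  local broker=$1
  local datasource=$2
  time_boundary_query "$datasource" | curl -X POST \
    -H 'Content-Type: application/json' -d @- "http://$broker/druid/v2/?pretty"
}

# Example (my_datasource is a placeholder name):
# query_broker 10.202.4.22:9102 my_datasource
```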